All of lore.kernel.org
 help / color / mirror / Atom feed
* [Ocfs2-devel] [PATCH 0/5] OCFS2 quota fixes
@ 2009-01-12 22:20 Jan Kara
  2009-01-12 22:20 ` [Ocfs2-devel] [PATCH 1/5] quota: Improve locking Jan Kara
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Jan Kara @ 2009-01-12 22:20 UTC (permalink / raw)
  To: ocfs2-devel

Hello,

the following series of patches fixes some issues with OCFS2 quotas.
The first patch modifies VFS quota locking, the next patch uses the
fact to simplify OCFS2 quota locking and solves a few deadlock issues.
The third and the fourth patches fix another possible deadlocks in OCFS2
quota code and the last patch is a minor cleanup.

								Honza

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Ocfs2-devel] [PATCH 1/5] quota: Improve locking
  2009-01-12 22:20 [Ocfs2-devel] [PATCH 0/5] OCFS2 quota fixes Jan Kara
@ 2009-01-12 22:20 ` Jan Kara
  2009-01-12 22:20   ` [Ocfs2-devel] [PATCH 2/5] ocfs2: Remove ocfs2_dquot_initialize() and ocfs2_dquot_drop() Jan Kara
  2009-01-13  6:01 ` [Ocfs2-devel] [PATCH 0/5] OCFS2 quota fixes tristan.ye
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 14+ messages in thread
From: Jan Kara @ 2009-01-12 22:20 UTC (permalink / raw)
  To: ocfs2-devel

We implement dqget() and dqput() that need neither dqonoff_mutex nor dqptr_sem.
Then move dqget() and dqput() calls so that they are not called from under
dqptr_sem. This is important because filesystem callbacks aren't called from
under dqptr_sem which used to cause *lots* of problems with lock ranking
(and with OCFS2 they became close to unsolvable).

The patch also removes two functions which were introduced solely because OCFS2
needed them to cope with the old locking scheme. As time showed, they were not
enough for OCFS2 anyway and it would be unnecessary work to adapt them to the
new locking scheme in which they aren't needed.  As a result OCFS2 needs the
following patch to compile properly with quotas.  Sorry to any bisecters which
hit this in advance.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/dquot.c               |  218 ++++++++++++++++++++++++++--------------------
 include/linux/quotaops.h |    2 -
 2 files changed, 122 insertions(+), 98 deletions(-)

diff --git a/fs/dquot.c b/fs/dquot.c
index 48c0571..bca3cac 100644
--- a/fs/dquot.c
+++ b/fs/dquot.c
@@ -87,14 +87,17 @@
 #define __DQUOT_PARANOIA
 
 /*
- * There are two quota SMP locks. dq_list_lock protects all lists with quotas
- * and quota formats and also dqstats structure containing statistics about the
- * lists. dq_data_lock protects data from dq_dqb and also mem_dqinfo structures
- * and also guards consistency of dquot->dq_dqb with inode->i_blocks, i_bytes.
+ * There are three quota SMP locks. dq_list_lock protects all lists with quotas
+ * and quota formats, dqstats structure containing statistics about the lists
+ * dq_data_lock protects data from dq_dqb and also mem_dqinfo structures and
+ * also guards consistency of dquot->dq_dqb with inode->i_blocks, i_bytes.
  * i_blocks and i_bytes updates itself are guarded by i_lock acquired directly
- * in inode_add_bytes() and inode_sub_bytes().
+ * in inode_add_bytes() and inode_sub_bytes(). dq_state_lock protects
+ * modifications of quota state (on quotaon and quotaoff) and readers who care
+ * about latest values take it as well.
  *
- * The spinlock ordering is hence: dq_data_lock > dq_list_lock > i_lock
+ * The spinlock ordering is hence: dq_data_lock > dq_list_lock > i_lock,
+ *   dq_list_lock > dq_state_lock
  *
  * Note that some things (eg. sb pointer, type, id) doesn't change during
  * the life of the dquot structure and so needn't to be protected by a lock
@@ -103,12 +106,7 @@
  * operation is just reading pointers from inode (or not using them at all) the
  * read lock is enough. If pointers are altered function must hold write lock
  * (these locking rules also apply for S_NOQUOTA flag in the inode - note that
- * for altering the flag i_mutex is also needed).  If operation is holding
- * reference to dquot in other way (e.g. quotactl ops) it must be guarded by
- * dqonoff_mutex.
- * This locking assures that:
- *   a) update/access to dquot pointers in inode is serialized
- *   b) everyone is guarded against invalidate_dquots()
+ * for altering the flag i_mutex is also needed).
  *
  * Each dquot has its dq_lock mutex. Locked dquots might not be referenced
  * from inodes (dquot_alloc_space() and such don't check the dq_lock).
@@ -122,10 +120,17 @@
  * Lock ordering (including related VFS locks) is the following:
  *   i_mutex > dqonoff_sem > journal_lock > dqptr_sem > dquot->dq_lock >
  *   dqio_mutex
+ * The lock ordering of dqptr_sem imposed by quota code is only dqonoff_sem >
+ * dqptr_sem. But filesystem has to count with the fact that functions such as
+ * dquot_alloc_space() acquire dqptr_sem and they usually have to be called
+ * from inside a transaction to keep filesystem consistency after a crash. Also
+ * filesystems usually want to do some IO on dquot from ->mark_dirty which is
+ * called with dqptr_sem held.
  * i_mutex on quota files is special (it's below dqio_mutex)
  */
 
 static DEFINE_SPINLOCK(dq_list_lock);
+static DEFINE_SPINLOCK(dq_state_lock);
 DEFINE_SPINLOCK(dq_data_lock);
 
 static char *quotatypes[] = INITQFNAMES;
@@ -428,7 +433,7 @@ static inline void do_destroy_dquot(struct dquot *dquot)
  * quota is disabled and pointers from inodes removed so there cannot be new
  * quota users. There can still be some users of quotas due to inodes being
  * just deleted or pruned by prune_icache() (those are not attached to any
- * list). We have to wait for such users.
+ * list) or parallel quotactl call. We have to wait for such users.
  */
 static void invalidate_dquots(struct super_block *sb, int type)
 {
@@ -600,7 +605,6 @@ static struct shrinker dqcache_shrinker = {
 /*
  * Put reference to dquot
  * NOTE: If you change this function please check whether dqput_blocks() works right...
- * MUST be called with either dqptr_sem or dqonoff_mutex held
  */
 void dqput(struct dquot *dquot)
 {
@@ -697,36 +701,30 @@ static struct dquot *get_empty_dquot(struct super_block *sb, int type)
 }
 
 /*
- * Check whether dquot is in memory.
- * MUST be called with either dqptr_sem or dqonoff_mutex held
- */
-int dquot_is_cached(struct super_block *sb, unsigned int id, int type)
-{
-	unsigned int hashent = hashfn(sb, id, type);
-	int ret = 0;
-
-        if (!sb_has_quota_active(sb, type))
-		return 0;
-	spin_lock(&dq_list_lock);
-	if (find_dquot(hashent, sb, id, type) != NODQUOT)
-		ret = 1;
-	spin_unlock(&dq_list_lock);
-	return ret;
-}
-
-/*
  * Get reference to dquot
- * MUST be called with either dqptr_sem or dqonoff_mutex held
+ *
+ * Locking is slightly tricky here. We are guarded from parallel quotaoff()
+ * destroying our dquot by:
+ *   a) checking for quota flags under dq_list_lock and
+ *   b) getting a reference to dquot before we release dq_list_lock
  */
 struct dquot *dqget(struct super_block *sb, unsigned int id, int type)
 {
 	unsigned int hashent = hashfn(sb, id, type);
-	struct dquot *dquot, *empty = NODQUOT;
+	struct dquot *dquot = NODQUOT, *empty = NODQUOT;
 
         if (!sb_has_quota_active(sb, type))
 		return NODQUOT;
 we_slept:
 	spin_lock(&dq_list_lock);
+	spin_lock(&dq_state_lock);
+	if (!sb_has_quota_active(sb, type)) {
+		spin_unlock(&dq_state_lock);
+		spin_unlock(&dq_list_lock);
+		goto out;
+	}
+	spin_unlock(&dq_state_lock);
+
 	if ((dquot = find_dquot(hashent, sb, id, type)) == NODQUOT) {
 		if (empty == NODQUOT) {
 			spin_unlock(&dq_list_lock);
@@ -735,6 +733,7 @@ we_slept:
 			goto we_slept;
 		}
 		dquot = empty;
+		empty = NODQUOT;
 		dquot->dq_id = id;
 		/* all dquots go on the inuse_list */
 		put_inuse(dquot);
@@ -749,8 +748,6 @@ we_slept:
 		dqstats.cache_hits++;
 		dqstats.lookups++;
 		spin_unlock(&dq_list_lock);
-		if (empty)
-			do_destroy_dquot(empty);
 	}
 	/* Wait for dq_lock - after this we know that either dquot_release() is already
 	 * finished or it will be canceled due to dq_count > 1 test */
@@ -758,11 +755,15 @@ we_slept:
 	/* Read the dquot and instantiate it (everything done only if needed) */
 	if (!test_bit(DQ_ACTIVE_B, &dquot->dq_flags) && sb->dq_op->acquire_dquot(dquot) < 0) {
 		dqput(dquot);
-		return NODQUOT;
+		dquot = NODQUOT;
+		goto out;
 	}
 #ifdef __DQUOT_PARANOIA
 	BUG_ON(!dquot->dq_sb);	/* Has somebody invalidated entry under us? */
 #endif
+out:
+	if (empty)
+		do_destroy_dquot(empty);
 
 	return dquot;
 }
@@ -1198,63 +1199,76 @@ static int info_bdq_free(struct dquot *dquot, qsize_t space)
 }
 /*
  *	Initialize quota pointers in inode
- *	Transaction must be started at entry
+ *	We do things in a bit complicated way but by that we avoid calling
+ *	dqget() and thus filesystem callbacks under dqptr_sem.
  */
 int dquot_initialize(struct inode *inode, int type)
 {
 	unsigned int id = 0;
 	int cnt, ret = 0;
+	struct dquot *got[MAXQUOTAS] = { NODQUOT, NODQUOT };
+	struct super_block *sb = inode->i_sb;
 
 	/* First test before acquiring mutex - solves deadlocks when we
          * re-enter the quota code and are already holding the mutex */
 	if (IS_NOQUOTA(inode))
 		return 0;
-	down_write(&sb_dqopt(inode->i_sb)->dqptr_sem);
+
+	/* First get references to structures we might need. */
+	for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
+		if (type != -1 && cnt != type)
+			continue;
+		switch (cnt) {
+		case USRQUOTA:
+			id = inode->i_uid;
+			break;
+		case GRPQUOTA:
+			id = inode->i_gid;
+			break;
+		}
+		got[cnt] = dqget(sb, id, cnt);
+	}
+
+	down_write(&sb_dqopt(sb)->dqptr_sem);
 	/* Having dqptr_sem we know NOQUOTA flags can't be altered... */
 	if (IS_NOQUOTA(inode))
 		goto out_err;
 	for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
 		if (type != -1 && cnt != type)
 			continue;
+		/* Avoid races with quotaoff() */
+		if (!sb_has_quota_active(sb, cnt))
+			continue;
 		if (inode->i_dquot[cnt] == NODQUOT) {
-			switch (cnt) {
-				case USRQUOTA:
-					id = inode->i_uid;
-					break;
-				case GRPQUOTA:
-					id = inode->i_gid;
-					break;
-			}
-			inode->i_dquot[cnt] = dqget(inode->i_sb, id, cnt);
+			inode->i_dquot[cnt] = got[cnt];
+			got[cnt] = NODQUOT;
 		}
 	}
 out_err:
-	up_write(&sb_dqopt(inode->i_sb)->dqptr_sem);
+	up_write(&sb_dqopt(sb)->dqptr_sem);
+	/* Drop unused references */
+	for (cnt = 0; cnt < MAXQUOTAS; cnt++)
+		dqput(got[cnt]);
 	return ret;
 }
 
 /*
  * 	Release all quotas referenced by inode
- *	Transaction must be started@an entry
  */
-int dquot_drop_locked(struct inode *inode)
+int dquot_drop(struct inode *inode)
 {
 	int cnt;
+	struct dquot *put[MAXQUOTAS];
 
+	down_write(&sb_dqopt(inode->i_sb)->dqptr_sem);
 	for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
-		if (inode->i_dquot[cnt] != NODQUOT) {
-			dqput(inode->i_dquot[cnt]);
-			inode->i_dquot[cnt] = NODQUOT;
-		}
+		put[cnt] = inode->i_dquot[cnt];
+		inode->i_dquot[cnt] = NODQUOT;
 	}
-	return 0;
-}
-
-int dquot_drop(struct inode *inode)
-{
-	down_write(&sb_dqopt(inode->i_sb)->dqptr_sem);
-	dquot_drop_locked(inode);
 	up_write(&sb_dqopt(inode->i_sb)->dqptr_sem);
+
+	for (cnt = 0; cnt < MAXQUOTAS; cnt++)
+		dqput(put[cnt]);
 	return 0;
 }
 
@@ -1470,8 +1484,9 @@ int dquot_transfer(struct inode *inode, struct iattr *iattr)
 	qsize_t space;
 	struct dquot *transfer_from[MAXQUOTAS];
 	struct dquot *transfer_to[MAXQUOTAS];
-	int cnt, ret = NO_QUOTA, chuid = (iattr->ia_valid & ATTR_UID) && inode->i_uid != iattr->ia_uid,
-	    chgid = (iattr->ia_valid & ATTR_GID) && inode->i_gid != iattr->ia_gid;
+	int cnt, ret = QUOTA_OK;
+	int chuid = iattr->ia_valid & ATTR_UID && inode->i_uid != iattr->ia_uid,
+	    chgid = iattr->ia_valid & ATTR_GID && inode->i_gid != iattr->ia_gid;
 	char warntype_to[MAXQUOTAS];
 	char warntype_from_inodes[MAXQUOTAS], warntype_from_space[MAXQUOTAS];
 
@@ -1479,21 +1494,11 @@ int dquot_transfer(struct inode *inode, struct iattr *iattr)
          * re-enter the quota code and are already holding the mutex */
 	if (IS_NOQUOTA(inode))
 		return QUOTA_OK;
-	/* Clear the arrays */
+	/* Initialize the arrays */
 	for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
-		transfer_to[cnt] = transfer_from[cnt] = NODQUOT;
+		transfer_from[cnt] = NODQUOT;
+		transfer_to[cnt] = NODQUOT;
 		warntype_to[cnt] = QUOTA_NL_NOWARN;
-	}
-	down_write(&sb_dqopt(inode->i_sb)->dqptr_sem);
-	/* Now recheck reliably when holding dqptr_sem */
-	if (IS_NOQUOTA(inode)) {	/* File without quota accounting? */
-		up_write(&sb_dqopt(inode->i_sb)->dqptr_sem);
-		return QUOTA_OK;
-	}
-	/* First build the transfer_to list - here we can block on
-	 * reading/instantiating of dquots.  We know that the transaction for
-	 * us was already started so we don't violate lock ranking here */
-	for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
 		switch (cnt) {
 			case USRQUOTA:
 				if (!chuid)
@@ -1507,6 +1512,13 @@ int dquot_transfer(struct inode *inode, struct iattr *iattr)
 				break;
 		}
 	}
+
+	down_write(&sb_dqopt(inode->i_sb)->dqptr_sem);
+	/* Now recheck reliably when holding dqptr_sem */
+	if (IS_NOQUOTA(inode)) {	/* File without quota accounting? */
+		up_write(&sb_dqopt(inode->i_sb)->dqptr_sem);
+		goto put_all;
+	}
 	spin_lock(&dq_data_lock);
 	space = inode_get_bytes(inode);
 	/* Build the transfer_from list and check the limits */
@@ -1517,7 +1529,7 @@ int dquot_transfer(struct inode *inode, struct iattr *iattr)
 		if (check_idq(transfer_to[cnt], 1, warntype_to + cnt) ==
 		    NO_QUOTA || check_bdq(transfer_to[cnt], space, 0,
 		    warntype_to + cnt) == NO_QUOTA)
-			goto warn_put_all;
+			goto over_quota;
 	}
 
 	/*
@@ -1545,28 +1557,37 @@ int dquot_transfer(struct inode *inode, struct iattr *iattr)
 
 		inode->i_dquot[cnt] = transfer_to[cnt];
 	}
-	ret = QUOTA_OK;
-warn_put_all:
 	spin_unlock(&dq_data_lock);
+	up_write(&sb_dqopt(inode->i_sb)->dqptr_sem);
+
 	/* Dirtify all the dquots - this can block when journalling */
 	for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
 		if (transfer_from[cnt])
 			mark_dquot_dirty(transfer_from[cnt]);
-		if (transfer_to[cnt])
+		if (transfer_to[cnt]) {
 			mark_dquot_dirty(transfer_to[cnt]);
+			/* The reference we got is transferred to the inode */
+			transfer_to[cnt] = NODQUOT;
+		}
 	}
+warn_put_all:
 	flush_warnings(transfer_to, warntype_to);
 	flush_warnings(transfer_from, warntype_from_inodes);
 	flush_warnings(transfer_from, warntype_from_space);
-	
+put_all:
 	for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
-		if (ret == QUOTA_OK && transfer_from[cnt] != NODQUOT)
-			dqput(transfer_from[cnt]);
-		if (ret == NO_QUOTA && transfer_to[cnt] != NODQUOT)
-			dqput(transfer_to[cnt]);
+		dqput(transfer_from[cnt]);
+		dqput(transfer_to[cnt]);
 	}
-	up_write(&sb_dqopt(inode->i_sb)->dqptr_sem);
 	return ret;
+over_quota:
+	spin_unlock(&dq_data_lock);
+	up_write(&sb_dqopt(inode->i_sb)->dqptr_sem);
+	/* Clear dquot pointers we don't want to dqput() */
+	for (cnt = 0; cnt < MAXQUOTAS; cnt++)
+		transfer_from[cnt] = NODQUOT;
+	ret = NO_QUOTA;
+	goto warn_put_all;
 }
 
 /* Wrapper for transferring ownership of an inode */
@@ -1651,19 +1672,24 @@ int vfs_quota_disable(struct super_block *sb, int type, unsigned int flags)
 			continue;
 
 		if (flags & DQUOT_SUSPENDED) {
+			spin_lock(&dq_state_lock);
 			dqopt->flags |=
 				dquot_state_flag(DQUOT_SUSPENDED, cnt);
+			spin_unlock(&dq_state_lock);
 		} else {
+			spin_lock(&dq_state_lock);
 			dqopt->flags &= ~dquot_state_flag(flags, cnt);
 			/* Turning off suspended quotas? */
 			if (!sb_has_quota_loaded(sb, cnt) &&
 			    sb_has_quota_suspended(sb, cnt)) {
 				dqopt->flags &=	~dquot_state_flag(
 							DQUOT_SUSPENDED, cnt);
+				spin_unlock(&dq_state_lock);
 				iput(dqopt->files[cnt]);
 				dqopt->files[cnt] = NULL;
 				continue;
 			}
+			spin_unlock(&dq_state_lock);
 		}
 
 		/* We still have to keep quota loaded? */
@@ -1830,7 +1856,9 @@ static int vfs_load_quota_inode(struct inode *inode, int type, int format_id,
 	}
 	mutex_unlock(&dqopt->dqio_mutex);
 	mutex_unlock(&inode->i_mutex);
+	spin_lock(&dq_state_lock);
 	dqopt->flags |= dquot_state_flag(flags, type);
+	spin_unlock(&dq_state_lock);
 
 	add_dquot_ref(sb, type);
 	mutex_unlock(&dqopt->dqonoff_mutex);
@@ -1872,9 +1900,11 @@ static int vfs_quota_on_remount(struct super_block *sb, int type)
 	}
 	inode = dqopt->files[type];
 	dqopt->files[type] = NULL;
+	spin_lock(&dq_state_lock);
 	flags = dqopt->flags & dquot_state_flag(DQUOT_USAGE_ENABLED |
 						DQUOT_LIMITS_ENABLED, type);
 	dqopt->flags &= ~dquot_state_flag(DQUOT_STATE_FLAGS, type);
+	spin_unlock(&dq_state_lock);
 	mutex_unlock(&dqopt->dqonoff_mutex);
 
 	flags = dquot_generic_flag(flags, type);
@@ -1952,7 +1982,9 @@ int vfs_quota_enable(struct inode *inode, int type, int format_id,
 			ret = -EBUSY;
 			goto out_lock;
 		}
+		spin_lock(&dq_state_lock);
 		sb_dqopt(sb)->flags |= dquot_state_flag(flags, type);
+		spin_unlock(&dq_state_lock);
 out_lock:
 		mutex_unlock(&dqopt->dqonoff_mutex);
 		return ret;
@@ -2039,14 +2071,12 @@ int vfs_get_dqblk(struct super_block *sb, int type, qid_t id, struct if_dqblk *d
 {
 	struct dquot *dquot;
 
-	mutex_lock(&sb_dqopt(sb)->dqonoff_mutex);
-	if (!(dquot = dqget(sb, id, type))) {
-		mutex_unlock(&sb_dqopt(sb)->dqonoff_mutex);
+	dquot = dqget(sb, id, type);
+	if (dquot == NODQUOT)
 		return -ESRCH;
-	}
 	do_get_dqblk(dquot, di);
 	dqput(dquot);
-	mutex_unlock(&sb_dqopt(sb)->dqonoff_mutex);
+
 	return 0;
 }
 
@@ -2130,7 +2160,6 @@ int vfs_set_dqblk(struct super_block *sb, int type, qid_t id, struct if_dqblk *d
 	struct dquot *dquot;
 	int rc;
 
-	mutex_lock(&sb_dqopt(sb)->dqonoff_mutex);
 	dquot = dqget(sb, id, type);
 	if (!dquot) {
 		rc = -ESRCH;
@@ -2139,7 +2168,6 @@ int vfs_set_dqblk(struct super_block *sb, int type, qid_t id, struct if_dqblk *d
 	rc = do_set_dqblk(dquot, di);
 	dqput(dquot);
 out:
-	mutex_unlock(&sb_dqopt(sb)->dqonoff_mutex);
 	return rc;
 }
 
@@ -2370,11 +2398,9 @@ EXPORT_SYMBOL(dquot_release);
 EXPORT_SYMBOL(dquot_mark_dquot_dirty);
 EXPORT_SYMBOL(dquot_initialize);
 EXPORT_SYMBOL(dquot_drop);
-EXPORT_SYMBOL(dquot_drop_locked);
 EXPORT_SYMBOL(vfs_dq_drop);
 EXPORT_SYMBOL(dqget);
 EXPORT_SYMBOL(dqput);
-EXPORT_SYMBOL(dquot_is_cached);
 EXPORT_SYMBOL(dquot_alloc_space);
 EXPORT_SYMBOL(dquot_alloc_inode);
 EXPORT_SYMBOL(dquot_free_space);
diff --git a/include/linux/quotaops.h b/include/linux/quotaops.h
index 21b781a..0b35b3a 100644
--- a/include/linux/quotaops.h
+++ b/include/linux/quotaops.h
@@ -24,10 +24,8 @@ void sync_dquots(struct super_block *sb, int type);
 
 int dquot_initialize(struct inode *inode, int type);
 int dquot_drop(struct inode *inode);
-int dquot_drop_locked(struct inode *inode);
 struct dquot *dqget(struct super_block *sb, unsigned int id, int type);
 void dqput(struct dquot *dquot);
-int dquot_is_cached(struct super_block *sb, unsigned int id, int type);
 int dquot_scan_active(struct super_block *sb,
 		      int (*fn)(struct dquot *dquot, unsigned long priv),
 		      unsigned long priv);
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Ocfs2-devel] [PATCH 2/5] ocfs2: Remove ocfs2_dquot_initialize() and ocfs2_dquot_drop()
  2009-01-12 22:20 ` [Ocfs2-devel] [PATCH 1/5] quota: Improve locking Jan Kara
@ 2009-01-12 22:20   ` Jan Kara
  2009-01-12 22:20     ` [Ocfs2-devel] [PATCH 3/5] ocfs2: Push out dropping of dentry lock to ocfs2_wq Jan Kara
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Kara @ 2009-01-12 22:20 UTC (permalink / raw)
  To: ocfs2-devel

Since ->acquire_dquot and ->release_dquot callbacks aren't called under
dqptr_sem anymore, we don't have to start a transaction and obtain locks
so early. So we can just remove all this complicated stuff.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ocfs2/quota_global.c |  169 +----------------------------------------------
 1 files changed, 2 insertions(+), 167 deletions(-)

diff --git a/fs/ocfs2/quota_global.c b/fs/ocfs2/quota_global.c
index 6aff8f2..f4efa89 100644
--- a/fs/ocfs2/quota_global.c
+++ b/fs/ocfs2/quota_global.c
@@ -810,171 +810,6 @@ out:
 	return status;
 }
 
-/* This is difficult. We have to lock quota inode and start transaction
- * in this function but we don't want to take the penalty of exlusive
- * quota file lock when we are just going to use cached structures. So
- * we just take read lock check whether we have dquot cached and if so,
- * we don't have to take the write lock... */
-static int ocfs2_dquot_initialize(struct inode *inode, int type)
-{
-	handle_t *handle = NULL;
-	int status = 0;
-	struct super_block *sb = inode->i_sb;
-	struct ocfs2_mem_dqinfo *oinfo;
-	int exclusive = 0;
-	int cnt;
-	qid_t id;
-
-	mlog_entry_void();
-
-	for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
-		if (type != -1 && cnt != type)
-			continue;
-		if (!sb_has_quota_active(sb, cnt))
-			continue;
-		oinfo = sb_dqinfo(sb, cnt)->dqi_priv;
-		status = ocfs2_lock_global_qf(oinfo, 0);
-		if (status < 0)
-			goto out;
-		/* This is just a performance optimization not a reliable test.
-		 * Since we hold an inode lock, noone can actually release
-		 * the structure until we are finished with initialization. */
-		if (inode->i_dquot[cnt] != NODQUOT) {
-			ocfs2_unlock_global_qf(oinfo, 0);
-			continue;
-		}
-		/* When we have inode lock, we know that no dquot_release() can
-		 * run and thus we can safely check whether we need to
-		 * read+modify global file to get quota information or whether
-		 * our node already has it. */
-		if (cnt == USRQUOTA)
-			id = inode->i_uid;
-		else if (cnt == GRPQUOTA)
-			id = inode->i_gid;
-		else
-			BUG();
-		/* Obtain exclusion from quota off... */
-		down_write(&sb_dqopt(sb)->dqptr_sem);
-		exclusive = !dquot_is_cached(sb, id, cnt);
-		up_write(&sb_dqopt(sb)->dqptr_sem);
-		if (exclusive) {
-			status = ocfs2_lock_global_qf(oinfo, 1);
-			if (status < 0) {
-				exclusive = 0;
-				mlog_errno(status);
-				goto out_ilock;
-			}
-			handle = ocfs2_start_trans(OCFS2_SB(sb),
-					ocfs2_calc_qinit_credits(sb, cnt));
-			if (IS_ERR(handle)) {
-				status = PTR_ERR(handle);
-				mlog_errno(status);
-				goto out_ilock;
-			}
-		}
-		dquot_initialize(inode, cnt);
-		if (exclusive) {
-			ocfs2_commit_trans(OCFS2_SB(sb), handle);
-			ocfs2_unlock_global_qf(oinfo, 1);
-		}
-		ocfs2_unlock_global_qf(oinfo, 0);
-	}
-	mlog_exit(0);
-	return 0;
-out_ilock:
-	if (exclusive)
-		ocfs2_unlock_global_qf(oinfo, 1);
-	ocfs2_unlock_global_qf(oinfo, 0);
-out:
-	mlog_exit(status);
-	return status;
-}
-
-static int ocfs2_dquot_drop_slow(struct inode *inode)
-{
-	int status = 0;
-	int cnt;
-	int got_lock[MAXQUOTAS] = {0, 0};
-	handle_t *handle;
-	struct super_block *sb = inode->i_sb;
-	struct ocfs2_mem_dqinfo *oinfo;
-
-	for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
-		if (!sb_has_quota_active(sb, cnt))
-			continue;
-		oinfo = sb_dqinfo(sb, cnt)->dqi_priv;
-		status = ocfs2_lock_global_qf(oinfo, 1);
-		if (status < 0)
-			goto out;
-		got_lock[cnt] = 1;
-	}
-	handle = ocfs2_start_trans(OCFS2_SB(sb),
-			ocfs2_calc_qinit_credits(sb, USRQUOTA) +
-			ocfs2_calc_qinit_credits(sb, GRPQUOTA));
-	if (IS_ERR(handle)) {
-		status = PTR_ERR(handle);
-		mlog_errno(status);
-		goto out;
-	}
-	dquot_drop(inode);
-	ocfs2_commit_trans(OCFS2_SB(sb), handle);
-out:
-	for (cnt = 0; cnt < MAXQUOTAS; cnt++)
-		if (got_lock[cnt]) {
-			oinfo = sb_dqinfo(sb, cnt)->dqi_priv;
-			ocfs2_unlock_global_qf(oinfo, 1);
-		}
-	return status;
-}
-
-/* See the comment before ocfs2_dquot_initialize. */
-static int ocfs2_dquot_drop(struct inode *inode)
-{
-	int status = 0;
-	struct super_block *sb = inode->i_sb;
-	struct ocfs2_mem_dqinfo *oinfo;
-	int exclusive = 0;
-	int cnt;
-	int got_lock[MAXQUOTAS] = {0, 0};
-
-	mlog_entry_void();
-	for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
-		if (!sb_has_quota_active(sb, cnt))
-			continue;
-		oinfo = sb_dqinfo(sb, cnt)->dqi_priv;
-		status = ocfs2_lock_global_qf(oinfo, 0);
-		if (status < 0)
-			goto out;
-		got_lock[cnt] = 1;
-	}
-	/* Lock against anyone releasing references so that when when we check
-	 * we know we are not going to be last ones to release dquot */
-	down_write(&sb_dqopt(sb)->dqptr_sem);
-	/* Urgh, this is a terrible hack :( */
-	for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
-		if (inode->i_dquot[cnt] != NODQUOT &&
-		    atomic_read(&inode->i_dquot[cnt]->dq_count) > 1) {
-			exclusive = 1;
-			break;
-		}
-	}
-	if (!exclusive)
-		dquot_drop_locked(inode);
-	up_write(&sb_dqopt(sb)->dqptr_sem);
-out:
-	for (cnt = 0; cnt < MAXQUOTAS; cnt++)
-		if (got_lock[cnt]) {
-			oinfo = sb_dqinfo(sb, cnt)->dqi_priv;
-			ocfs2_unlock_global_qf(oinfo, 0);
-		}
-	/* In case we bailed out because we had to do expensive locking
-	 * do it now... */
-	if (exclusive)
-		status = ocfs2_dquot_drop_slow(inode);
-	mlog_exit(status);
-	return status;
-}
-
 static struct dquot *ocfs2_alloc_dquot(struct super_block *sb, int type)
 {
 	struct ocfs2_dquot *dquot =
@@ -991,8 +826,8 @@ static void ocfs2_destroy_dquot(struct dquot *dquot)
 }
 
 struct dquot_operations ocfs2_quota_operations = {
-	.initialize	= ocfs2_dquot_initialize,
-	.drop		= ocfs2_dquot_drop,
+	.initialize	= dquot_initialize,
+	.drop		= dquot_drop,
 	.alloc_space	= dquot_alloc_space,
 	.alloc_inode	= dquot_alloc_inode,
 	.free_space	= dquot_free_space,
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Ocfs2-devel] [PATCH 3/5] ocfs2: Push out dropping of dentry lock to ocfs2_wq
  2009-01-12 22:20   ` [Ocfs2-devel] [PATCH 2/5] ocfs2: Remove ocfs2_dquot_initialize() and ocfs2_dquot_drop() Jan Kara
@ 2009-01-12 22:20     ` Jan Kara
  2009-01-12 22:20       ` [Ocfs2-devel] [PATCH 4/5] ocfs2: Fix possible deadlock in ocfs2_write_dquot() Jan Kara
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Kara @ 2009-01-12 22:20 UTC (permalink / raw)
  To: ocfs2-devel

Dropping of last reference to dentry lock is a complicated operation involving
dropping of reference to inode. This can get complicated and quota code in
particular needs to obtain some quota locks which leads to potential deadlock.
Thus we defer dropping of inode reference to ocfs2_wq.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ocfs2/dcache.c |   42 +++++++++++++++++++++++++++++++++++++++---
 fs/ocfs2/dcache.h |    9 ++++++++-
 fs/ocfs2/ocfs2.h  |    6 ++++++
 fs/ocfs2/super.c  |    3 +++
 4 files changed, 56 insertions(+), 4 deletions(-)

diff --git a/fs/ocfs2/dcache.c b/fs/ocfs2/dcache.c
index b1cc7c3..e9d7c20 100644
--- a/fs/ocfs2/dcache.c
+++ b/fs/ocfs2/dcache.c
@@ -38,6 +38,7 @@
 #include "dlmglue.h"
 #include "file.h"
 #include "inode.h"
+#include "super.h"
 
 
 static int ocfs2_dentry_revalidate(struct dentry *dentry,
@@ -294,6 +295,34 @@ out_attach:
 	return ret;
 }
 
+static DEFINE_SPINLOCK(dentry_list_lock);
+
+/* We limit the number of dentry locks to drop in one go. We have
+ * this limit so that we don't starve other users of ocfs2_wq. */
+#define DL_INODE_DROP_COUNT 64
+
+/* Drop inode references from dentry locks */
+void ocfs2_drop_dl_inodes(struct work_struct *work)
+{
+	struct ocfs2_super *osb = container_of(work, struct ocfs2_super,
+					       dentry_lock_work);
+	struct ocfs2_dentry_lock *dl;
+	int drop_count = DL_INODE_DROP_COUNT;
+
+	spin_lock(&dentry_list_lock);
+	while (osb->dentry_lock_list && drop_count--) {
+		dl = osb->dentry_lock_list;
+		osb->dentry_lock_list = dl->dl_next;
+		spin_unlock(&dentry_list_lock);
+		iput(dl->dl_inode);
+		kfree(dl);
+		spin_lock(&dentry_list_lock);
+	}
+	if (osb->dentry_lock_list)
+		queue_work(ocfs2_wq, &osb->dentry_lock_work);
+	spin_unlock(&dentry_list_lock);
+}
+
 /*
  * ocfs2_dentry_iput() and friends.
  *
@@ -318,16 +347,23 @@ out_attach:
 static void ocfs2_drop_dentry_lock(struct ocfs2_super *osb,
 				   struct ocfs2_dentry_lock *dl)
 {
-	iput(dl->dl_inode);
 	ocfs2_simple_drop_lockres(osb, &dl->dl_lockres);
 	ocfs2_lock_res_free(&dl->dl_lockres);
-	kfree(dl);
+
+	/* We leave dropping of inode reference to ocfs2_wq as that can
+	 * possibly lead to inode deletion which gets tricky */
+	spin_lock(&dentry_list_lock);
+	if (!osb->dentry_lock_list)
+		queue_work(ocfs2_wq, &osb->dentry_lock_work);
+	dl->dl_next = osb->dentry_lock_list;
+	osb->dentry_lock_list = dl;
+	spin_unlock(&dentry_list_lock);
 }
 
 void ocfs2_dentry_lock_put(struct ocfs2_super *osb,
 			   struct ocfs2_dentry_lock *dl)
 {
-	int unlock = 0;
+	int unlock;
 
 	BUG_ON(dl->dl_count == 0);
 
diff --git a/fs/ocfs2/dcache.h b/fs/ocfs2/dcache.h
index c091c34..d06e16c 100644
--- a/fs/ocfs2/dcache.h
+++ b/fs/ocfs2/dcache.h
@@ -29,8 +29,13 @@
 extern struct dentry_operations ocfs2_dentry_ops;
 
 struct ocfs2_dentry_lock {
+	/* Use count of dentry lock */
 	unsigned int		dl_count;
-	u64			dl_parent_blkno;
+	union {
+		/* Linked list of dentry locks to release */
+		struct ocfs2_dentry_lock *dl_next;
+		u64			dl_parent_blkno;
+	};
 
 	/*
 	 * The ocfs2_dentry_lock keeps an inode reference until
@@ -47,6 +52,8 @@ int ocfs2_dentry_attach_lock(struct dentry *dentry, struct inode *inode,
 void ocfs2_dentry_lock_put(struct ocfs2_super *osb,
 			   struct ocfs2_dentry_lock *dl);
 
+void ocfs2_drop_dl_inodes(struct work_struct *work);
+
 struct dentry *ocfs2_find_local_alias(struct inode *inode, u64 parent_blkno,
 				      int skip_unhashed);
 
diff --git a/fs/ocfs2/ocfs2.h b/fs/ocfs2/ocfs2.h
index ad5c24a..0773841 100644
--- a/fs/ocfs2/ocfs2.h
+++ b/fs/ocfs2/ocfs2.h
@@ -210,6 +210,7 @@ struct ocfs2_journal;
 struct ocfs2_slot_info;
 struct ocfs2_recovery_map;
 struct ocfs2_quota_recovery;
+struct ocfs2_dentry_lock;
 struct ocfs2_super
 {
 	struct task_struct *commit_task;
@@ -325,6 +326,11 @@ struct ocfs2_super
 	struct list_head blocked_lock_list;
 	unsigned long blocked_lock_count;
 
+	/* List of dentry locks to release. Anyone can add locks to
+	 * the list, ocfs2_wq processes the list  */
+	struct ocfs2_dentry_lock *dentry_lock_list;
+	struct work_struct dentry_lock_work;
+
 	wait_queue_head_t		osb_mount_event;
 
 	/* Truncate log info */
diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index 43ed113..b1cb38f 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -1887,6 +1887,9 @@ static int ocfs2_initialize_super(struct super_block *sb,
 	INIT_WORK(&journal->j_recovery_work, ocfs2_complete_recovery);
 	journal->j_state = OCFS2_JOURNAL_FREE;
 
+	INIT_WORK(&osb->dentry_lock_work, ocfs2_drop_dl_inodes);
+	osb->dentry_lock_list = NULL;
+
 	/* get some pseudo constants for clustersize bits */
 	osb->s_clustersize_bits =
 		le32_to_cpu(di->id2.i_super.s_clustersize_bits);
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Ocfs2-devel] [PATCH 4/5] ocfs2: Fix possible deadlock in ocfs2_write_dquot()
  2009-01-12 22:20     ` [Ocfs2-devel] [PATCH 3/5] ocfs2: Push out dropping of dentry lock to ocfs2_wq Jan Kara
@ 2009-01-12 22:20       ` Jan Kara
  2009-01-12 22:20         ` [Ocfs2-devel] [PATCH 5/5] ocfs2: Remove bogus link count decrement Jan Kara
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Kara @ 2009-01-12 22:20 UTC (permalink / raw)
  To: ocfs2-devel

It could happen that some limit has been set via quotactl() and in parallel
->mark_dirty() is called from another thread doing e.g. dquot_alloc_space(). In
such case ocfs2_write_dquot() must not try to sync the dquot because that needs
global quota lock but that ranks above transaction start.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ocfs2/quota_global.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/fs/ocfs2/quota_global.c b/fs/ocfs2/quota_global.c
index f4efa89..1ed0f7c 100644
--- a/fs/ocfs2/quota_global.c
+++ b/fs/ocfs2/quota_global.c
@@ -754,7 +754,9 @@ static int ocfs2_mark_dquot_dirty(struct dquot *dquot)
 	if (dquot->dq_flags & mask)
 		sync = 1;
 	spin_unlock(&dq_data_lock);
-	if (!sync) {
+	/* This is a slight hack but we can't afford getting global quota
+	 * lock if we already have a transaction started. */
+	if (!sync || journal_current_handle()) {
 		status = ocfs2_write_dquot(dquot);
 		goto out;
 	}
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Ocfs2-devel] [PATCH 5/5] ocfs2: Remove bogus link count decrement
  2009-01-12 22:20       ` [Ocfs2-devel] [PATCH 4/5] ocfs2: Fix possible deadlock in ocfs2_write_dquot() Jan Kara
@ 2009-01-12 22:20         ` Jan Kara
  2009-01-14  5:56           ` tristan.ye
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Kara @ 2009-01-12 22:20 UTC (permalink / raw)
  To: ocfs2-devel

In ocfs2_unlink() we decremented i_nlink three times for directories (instead
of just two times). This actually did not matter because we have
ocfs2_drop_inode() which decides whether the inode should be deleted based on a
flag but it's at least surprising for a link count to wrap...

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ocfs2/namei.c |    2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/fs/ocfs2/namei.c b/fs/ocfs2/namei.c
index 084aba8..892d5e7 100644
--- a/fs/ocfs2/namei.c
+++ b/fs/ocfs2/namei.c
@@ -888,8 +888,6 @@ static int ocfs2_unlink(struct inode *dir,
 	}
 
 	dir->i_ctime = dir->i_mtime = CURRENT_TIME;
-	if (S_ISDIR(inode->i_mode))
-		drop_nlink(dir);
 
 	status = ocfs2_mark_inode_dirty(handle, dir, parent_node_bh);
 	if (status < 0) {
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Ocfs2-devel] [PATCH 0/5] OCFS2 quota fixes
  2009-01-12 22:20 [Ocfs2-devel] [PATCH 0/5] OCFS2 quota fixes Jan Kara
  2009-01-12 22:20 ` [Ocfs2-devel] [PATCH 1/5] quota: Improve locking Jan Kara
@ 2009-01-13  6:01 ` tristan.ye
  2009-01-13 11:19   ` Jan Kara
  2009-01-14  5:48 ` tristan.ye
  2009-01-15  2:19 ` Mark Fasheh
  3 siblings, 1 reply; 14+ messages in thread
From: tristan.ye @ 2009-01-13  6:01 UTC (permalink / raw)
  To: ocfs2-devel

Jan Kara wrote:
> Hello,
>
> the following series of patches fixes some issues with OCFS2 quotas.
> The first patch modifies VFS quota locking, the next patch uses the
> fact to simplify OCFS2 quota locking and solves a few deadlock issues.
> The third and the fourth patches fix another possible deadlocks in OCFS2
> quota code and the last patch is a minor cleanup.
>   
Hi Jan,

Would you like to show me the kernel tree/branch where your patches're 
going to be applied?

Btw, I'm also wondering if your latest patches which resolves some 
deadlock issues can also be applicable for bug 1043&1067(deadlock on 
inode removal) on bugzilla:
http://oss.oracle.com/bugzilla/show_bug.cgi?id=1043

At last, It will be greatly appreciated if you can offer me your newly 
released quota-tools for OCFS2. Just simply attach the src tarball in 
replying mail is OK, or let me know the URL where your quota-tools for 
OCFS2 released:-)

Thanks in advance,

Tristan
> 								Honza
>
>
>
> _______________________________________________
> Ocfs2-devel mailing list
> Ocfs2-devel at oss.oracle.com
> http://oss.oracle.com/mailman/listinfo/ocfs2-devel
>   

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Ocfs2-devel] [PATCH 0/5] OCFS2 quota fixes
  2009-01-13  6:01 ` [Ocfs2-devel] [PATCH 0/5] OCFS2 quota fixes tristan.ye
@ 2009-01-13 11:19   ` Jan Kara
  2009-01-14  1:51     ` tristan.ye
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Kara @ 2009-01-13 11:19 UTC (permalink / raw)
  To: ocfs2-devel

  Hello,

On Tue 13-01-09 14:01:04, tristan.ye wrote:
> Jan Kara wrote:
>> the following series of patches fixes some issues with OCFS2 quotas.
>> The first patch modifies VFS quota locking, the next patch uses the
>> fact to simplify OCFS2 quota locking and solves a few deadlock issues.
>> The third and the fourth patches fix another possible deadlocks in OCFS2
>> quota code and the last patch is a minor cleanup.
>>   
> Hi Jan,
>
> Would you like to show me the kernel tree/branch where your patches're 
> going to be applied?
  I've generated the patches against a master branch of Mark's git as of
yesterday. But they aren't very intrusive so I guess they should apply also
to other branches just fine.

> Btw, I'm also wondering if your latest patches which resolves some deadlock 
> issues can also be applicable for bug 1043&1067(deadlock on inode removal) 
> on bugzilla:
> http://oss.oracle.com/bugzilla/show_bug.cgi?id=1043
  Yes, they're exactly meant to solve these issues. I'm not able to trigger
the deadlock with these patches anymore. I was to tired to update the
bugzilla yesterday evening, I wanted to do it when Mark takes the patches.

> At last, It will be greatly appreciated if you can offer me your newly 
> released quota-tools for OCFS2. Just simply attach the src tarball in 
> replying mail is OK, or let me know the URL where your quota-tools for 
> OCFS2 released:-)
  You can download the source tarball from quota-tools Sourceforge pages:
http://sourceforge.net/project/showfiles.php?group_id=18136&package_id=13522

									Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Ocfs2-devel] [PATCH 0/5] OCFS2 quota fixes
  2009-01-13 11:19   ` Jan Kara
@ 2009-01-14  1:51     ` tristan.ye
  2009-01-14 11:24       ` Jan Kara
  0 siblings, 1 reply; 14+ messages in thread
From: tristan.ye @ 2009-01-14  1:51 UTC (permalink / raw)
  To: ocfs2-devel

On Tue, 2009-01-13 at 12:19 +0100, Jan Kara wrote:
> Hello,
> 
> On Tue 13-01-09 14:01:04, tristan.ye wrote:
> > Jan Kara wrote:
> >> the following series of patches fixes some issues with OCFS2 quotas.
> >> The first patch modifies VFS quota locking, the next patch uses the
> >> fact to simplify OCFS2 quota locking and solves a few deadlock issues.
> >> The third and the fourth patches fix another possible deadlocks in OCFS2
> >> quota code and the last patch is a minor cleanup.
> >>   
> > Hi Jan,
> >
> > Would you like to show me the kernel tree/branch where your patches're 
> > going to be applied?
>   I've generated the patches against a master branch of Mark's git as of
> yesterday. But they aren't very intrusive so I guess they should apply also
> to other branches just fine.
> 
> > Btw, I'm also wondering if your latest patches which resolves some deadlock 
> > issues can also be applicable for bug 1043&1067(deadlock on inode removal) 
> > on bugzilla:
> > http://oss.oracle.com/bugzilla/show_bug.cgi?id=1043
>   Yes, they're exactly meant to solve these issues. I'm not able to trigger
> the deadlock with these patches anymore. I was to tired to update the
> bugzilla yesterday evening, I wanted to do it when Mark takes the patches.
> 
> > At last, It will be greatly appreciated if you can offer me your newly 
> > released quota-tools for OCFS2. Just simply attach the src tarball in 
> > replying mail is OK, or let me know the URL where your quota-tools for 
> > OCFS2 released:-)
>   You can download the source tarball from quota-tools Sourceforge pages:
> http://sourceforge.net/project/showfiles.php?group_id=18136&package_id=13522

The quota-tools above was not ocfs2-specific?  it's generic for both
ext3 and ocfs2? right?

Thanks,

Tristan

> 
> 									Honza

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Ocfs2-devel] [PATCH 0/5] OCFS2 quota fixes
  2009-01-12 22:20 [Ocfs2-devel] [PATCH 0/5] OCFS2 quota fixes Jan Kara
  2009-01-12 22:20 ` [Ocfs2-devel] [PATCH 1/5] quota: Improve locking Jan Kara
  2009-01-13  6:01 ` [Ocfs2-devel] [PATCH 0/5] OCFS2 quota fixes tristan.ye
@ 2009-01-14  5:48 ` tristan.ye
  2009-01-15  2:19 ` Mark Fasheh
  3 siblings, 0 replies; 14+ messages in thread
From: tristan.ye @ 2009-01-14  5:48 UTC (permalink / raw)
  To: ocfs2-devel

On Mon, 2009-01-12 at 23:20 +0100, Jan Kara wrote:
> Hello,
> 
> the following series of patches fixes some issues with OCFS2 quotas.
> The first patch modifies VFS quota locking, the next patch uses the
> fact to simplify OCFS2 quota locking and solves a few deadlock issues.

Seems this series of patches did not fix the deadlock issue completely,
they do resolve the bug 1043, but still did not fix bug 1067, and it
even brought us a new regression bug,see my comments in bug 1043,1067
and in your 5th patch:-)

Thanks,

Tristan

> The third and the fourth patches fix another possible deadlocks in OCFS2
> quota code and the last patch is a minor cleanup.
> 
> 								Honza
> 
> 
> 
> _______________________________________________
> Ocfs2-devel mailing list
> Ocfs2-devel at oss.oracle.com
> http://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Ocfs2-devel] [PATCH 5/5] ocfs2: Remove bogus link count decrement
  2009-01-12 22:20         ` [Ocfs2-devel] [PATCH 5/5] ocfs2: Remove bogus link count decrement Jan Kara
@ 2009-01-14  5:56           ` tristan.ye
  2009-01-14 11:19             ` Jan Kara
  0 siblings, 1 reply; 14+ messages in thread
From: tristan.ye @ 2009-01-14  5:56 UTC (permalink / raw)
  To: ocfs2-devel

On Mon, 2009-01-12 at 23:20 +0100, Jan Kara wrote:
> In ocfs2_unlink() we decremented i_nlink three times for directories (instead
> of just two times). This actually did not matter because we have
> ocfs2_drop_inode() which decides whether the inode should be deleted based on a
> flag but it's at least surprising for a link count to wrap...
> 
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  fs/ocfs2/namei.c |    2 --
>  1 files changed, 0 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/ocfs2/namei.c b/fs/ocfs2/namei.c
> index 084aba8..892d5e7 100644
> --- a/fs/ocfs2/namei.c
> +++ b/fs/ocfs2/namei.c
> @@ -888,8 +888,6 @@ static int ocfs2_unlink(struct inode *dir,
>  	}
>  
>  	dir->i_ctime = dir->i_mtime = CURRENT_TIME;
> -	if (S_ISDIR(inode->i_mode))
> -		drop_nlink(dir);

I thought the logic of original ocfs2_unlink() has no problem, it seems
just drop the link count 2 times to release reference count for  '.' and
'..' in all for the deleting directory inode.

The code you've truncated is the logic to drop link count for its parent
directory, and it's a REQUIRED action, you may see it as
drop_nlink(inode) by mistake:-) ? 

Otherwise, VFS unlink() will hit a failure when deleting a dir which
containes sub_dir
entires.  Like following,

1. mkdir a

2. mkdir -p a/b

3. rm -rf a
rm: cannot remove directory `a/': Directory not empty


Tristan 

>  
>  	status = ocfs2_mark_inode_dirty(handle, dir, parent_node_bh);
>  	if (status < 0) {

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Ocfs2-devel] [PATCH 5/5] ocfs2: Remove bogus link count decrement
  2009-01-14  5:56           ` tristan.ye
@ 2009-01-14 11:19             ` Jan Kara
  0 siblings, 0 replies; 14+ messages in thread
From: Jan Kara @ 2009-01-14 11:19 UTC (permalink / raw)
  To: ocfs2-devel

On Wed 14-01-09 13:56:55, tristan.ye wrote:
> On Mon, 2009-01-12 at 23:20 +0100, Jan Kara wrote:
> > In ocfs2_unlink() we decremented i_nlink three times for directories (instead
> > of just two times). This actually did not matter because we have
> > ocfs2_drop_inode() which decides whether the inode should be deleted based on a
> > flag but it's at least surprising for a link count to wrap...
> > 
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > ---
> >  fs/ocfs2/namei.c |    2 --
> >  1 files changed, 0 insertions(+), 2 deletions(-)
> > 
> > diff --git a/fs/ocfs2/namei.c b/fs/ocfs2/namei.c
> > index 084aba8..892d5e7 100644
> > --- a/fs/ocfs2/namei.c
> > +++ b/fs/ocfs2/namei.c
> > @@ -888,8 +888,6 @@ static int ocfs2_unlink(struct inode *dir,
> >  	}
> >  
> >  	dir->i_ctime = dir->i_mtime = CURRENT_TIME;
> > -	if (S_ISDIR(inode->i_mode))
> > -		drop_nlink(dir);
> 
> I thought the logic of original ocfs2_unlink() has no problem, it seems
> just drop the link count 2 times to release reference count for  '.' and
> '..' in all for the deleting directory inode.
> 
> The code you've truncated is the logic to drop link count for its parent
> directory, and it's a REQUIRED action, you may see it as
> drop_nlink(inode) by mistake:-) ? 
  Yes, I've read the code wrong. Thanks for spotting this.

> Otherwise, VFS unlink() will hit a failure when deleting a dir which
> containes sub_dir entires.  Like following,
> 
> 1. mkdir a
> 
> 2. mkdir -p a/b
> 
> 3. rm -rf a
> rm: cannot remove directory `a/': Directory not empty
  Hmm, I was just testing on one level of directories so my testing didn't
reveal the problem. Sigh.

									Honza

-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Ocfs2-devel] [PATCH 0/5] OCFS2 quota fixes
  2009-01-14  1:51     ` tristan.ye
@ 2009-01-14 11:24       ` Jan Kara
  0 siblings, 0 replies; 14+ messages in thread
From: Jan Kara @ 2009-01-14 11:24 UTC (permalink / raw)
  To: ocfs2-devel

On Wed 14-01-09 09:51:04, tristan.ye wrote:
> On Tue, 2009-01-13 at 12:19 +0100, Jan Kara wrote:
> > Hello,
> > 
> > On Tue 13-01-09 14:01:04, tristan.ye wrote:
> > > Jan Kara wrote:
> > >> the following series of patches fixes some issues with OCFS2 quotas.
> > >> The first patch modifies VFS quota locking, the next patch uses the
> > >> fact to simplify OCFS2 quota locking and solves a few deadlock issues.
> > >> The third and the fourth patches fix another possible deadlocks in OCFS2
> > >> quota code and the last patch is a minor cleanup.
> > >>   
> > > Hi Jan,
> > >
> > > Would you like to show me the kernel tree/branch where your patches're 
> > > going to be applied?
> >   I've generated the patches against a master branch of Mark's git as of
> > yesterday. But they aren't very intrusive so I guess they should apply also
> > to other branches just fine.
> > 
> > > Btw, I'm also wondering if your latest patches which resolves some deadlock 
> > > issues can also be applicable for bug 1043&1067(deadlock on inode removal) 
> > > on bugzilla:
> > > http://oss.oracle.com/bugzilla/show_bug.cgi?id=1043
> >   Yes, they're exactly meant to solve these issues. I'm not able to trigger
> > the deadlock with these patches anymore. I was to tired to update the
> > bugzilla yesterday evening, I wanted to do it when Mark takes the patches.
> > 
> > > At last, It will be greatly appreciated if you can offer me your newly 
> > > released quota-tools for OCFS2. Just simply attach the src tarball in 
> > > replying mail is OK, or let me know the URL where your quota-tools for 
> > > OCFS2 released:-)
> >   You can download the source tarball from quota-tools Sourceforge pages:
> > http://sourceforge.net/project/showfiles.php?group_id=18136&package_id=13522
> 
> The quota-tools above was not ocfs2-specific?  it's generic for both
> ext3 and ocfs2? right?
  Yes. I've merged changes needed for OCFS2 to general quota tools.

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Ocfs2-devel] [PATCH 0/5] OCFS2 quota fixes
  2009-01-12 22:20 [Ocfs2-devel] [PATCH 0/5] OCFS2 quota fixes Jan Kara
                   ` (2 preceding siblings ...)
  2009-01-14  5:48 ` tristan.ye
@ 2009-01-15  2:19 ` Mark Fasheh
  3 siblings, 0 replies; 14+ messages in thread
From: Mark Fasheh @ 2009-01-15  2:19 UTC (permalink / raw)
  To: ocfs2-devel

On Mon, Jan 12, 2009 at 11:20:28PM +0100, Jan Kara wrote:
> Hello,
> 
> the following series of patches fixes some issues with OCFS2 quotas.
> The first patch modifies VFS quota locking, the next patch uses the
> fact to simplify OCFS2 quota locking and solves a few deadlock issues.
> The third and the fourth patches fix another possible deadlocks in OCFS2
> quota code and the last patch is a minor cleanup.
Ok, patches 1-4 are in Ocfs2.git. I think we should send them upstream
during this cycle. Do you want to post these on linux-fsdevel too?
        --Mark        


--
Mark Fasheh

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2009-01-15  2:19 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-01-12 22:20 [Ocfs2-devel] [PATCH 0/5] OCFS2 quota fixes Jan Kara
2009-01-12 22:20 ` [Ocfs2-devel] [PATCH 1/5] quota: Improve locking Jan Kara
2009-01-12 22:20   ` [Ocfs2-devel] [PATCH 2/5] ocfs2: Remove ocfs2_dquot_initialize() and ocfs2_dquot_drop() Jan Kara
2009-01-12 22:20     ` [Ocfs2-devel] [PATCH 3/5] ocfs2: Push out dropping of dentry lock to ocfs2_wq Jan Kara
2009-01-12 22:20       ` [Ocfs2-devel] [PATCH 4/5] ocfs2: Fix possible deadlock in ocfs2_write_dquot() Jan Kara
2009-01-12 22:20         ` [Ocfs2-devel] [PATCH 5/5] ocfs2: Remove bogus link count decrement Jan Kara
2009-01-14  5:56           ` tristan.ye
2009-01-14 11:19             ` Jan Kara
2009-01-13  6:01 ` [Ocfs2-devel] [PATCH 0/5] OCFS2 quota fixes tristan.ye
2009-01-13 11:19   ` Jan Kara
2009-01-14  1:51     ` tristan.ye
2009-01-14 11:24       ` Jan Kara
2009-01-14  5:48 ` tristan.ye
2009-01-15  2:19 ` Mark Fasheh

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.