[Cluster-devel] GFS2: Pre-pull patch posting


* [Cluster-devel] GFS2: Pre-pull patch posting
@ 2010-05-17 12:40 Steven Whitehouse
  2010-05-17 12:40 ` [Cluster-devel] [PATCH 01/11] GFS2: Remove space from slab cache name Steven Whitehouse
  0 siblings, 1 reply; 27+ messages in thread
From: Steven Whitehouse @ 2010-05-17 12:40 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

Nothing very exciting this time... mostly minor bug fixes and
a docs update. The gfs2_logd patch has been hanging around for
a long time and is now finally integrated. It is the first step
towards a longer-term goal of improving performance in that
area.

Steve.


* [Cluster-devel] [PATCH 01/11] GFS2: Remove space from slab cache name
  2010-05-17 12:40 [Cluster-devel] GFS2: Pre-pull patch posting Steven Whitehouse
@ 2010-05-17 12:40 ` Steven Whitehouse
  2010-05-17 12:40   ` [Cluster-devel] [PATCH 02/11] GFS2: docs update Steven Whitehouse
  0 siblings, 1 reply; 27+ messages in thread
From: Steven Whitehouse @ 2010-05-17 12:40 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Apparently this might confuse parsers.
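
To illustrate the kind of breakage meant here, a minimal sketch follows. It assumes a whitespace-delimited record of the form "name active total", which is a simplified stand-in and not the exact layout of any kernel file, and the helper parse_cache_line() is hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical parser for a simplified "name active total" record.
 * Returns the number of fields sscanf() converted; 3 means success.
 */
static int parse_cache_line(const char *line, char name[64],
			    unsigned long *active, unsigned long *total)
{
	assert(line != NULL);
	/*
	 * %63s stops at the first whitespace, so a space inside the
	 * cache name ends the name early and the next token is then
	 * read as if it were the first counter.
	 */
	return sscanf(line, "%63s %lu %lu", name, active, total);
}
```

With the name "gfs2_glock(aspace)" all three fields convert; with "gfs2_glock (aspace)" the token "(aspace)" lands where a number is expected and the conversion stops after the name.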

Reported-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
---
 fs/gfs2/main.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/gfs2/main.c b/fs/gfs2/main.c
index a88fadc..fb2a5f9 100644
--- a/fs/gfs2/main.c
+++ b/fs/gfs2/main.c
@@ -94,7 +94,7 @@ static int __init init_gfs2_fs(void)
 	if (!gfs2_glock_cachep)
 		goto fail;
 
-	gfs2_glock_aspace_cachep = kmem_cache_create("gfs2_glock (aspace)",
+	gfs2_glock_aspace_cachep = kmem_cache_create("gfs2_glock(aspace)",
 					sizeof(struct gfs2_glock) +
 					sizeof(struct address_space),
 					0, 0, gfs2_init_gl_aspace_once);
-- 
1.6.2.5




* [Cluster-devel] [PATCH 02/11] GFS2: docs update
  2010-05-17 12:40 ` [Cluster-devel] [PATCH 01/11] GFS2: Remove space from slab cache name Steven Whitehouse
@ 2010-05-17 12:40   ` Steven Whitehouse
  2010-05-17 12:40     ` [Cluster-devel] [PATCH 03/11] GFS2: Clean up stuffed file copying Steven Whitehouse
  0 siblings, 1 reply; 27+ messages in thread
From: Steven Whitehouse @ 2010-05-17 12:40 UTC (permalink / raw)
  To: cluster-devel.redhat.com

From: Andrea Gelmini <andrea.gelmini@gelma.net>

http://sources.redhat.com/cluster/ now redirects to
http://sources.redhat.com/cluster/wiki/, so update the link.

Also fix the tabs at the end of the file.

Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
---
 Documentation/filesystems/gfs2.txt |   12 ++++++------
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/Documentation/filesystems/gfs2.txt b/Documentation/filesystems/gfs2.txt
index 5e3ab8f..0b59c02 100644
--- a/Documentation/filesystems/gfs2.txt
+++ b/Documentation/filesystems/gfs2.txt
@@ -1,7 +1,7 @@
 Global File System
 ------------------
 
-http://sources.redhat.com/cluster/
+http://sources.redhat.com/cluster/wiki/
 
 GFS is a cluster file system. It allows a cluster of computers to
 simultaneously use a block device that is shared between them (with FC,
@@ -36,11 +36,11 @@ GFS2 is not on-disk compatible with previous versions of GFS, but it
 is pretty close.
 
 The following man pages can be found at the URL above:
-  fsck.gfs2	to repair a filesystem
-  gfs2_grow	to expand a filesystem online
-  gfs2_jadd	to add journals to a filesystem online
-  gfs2_tool	to manipulate, examine and tune a filesystem
+  fsck.gfs2		to repair a filesystem
+  gfs2_grow		to expand a filesystem online
+  gfs2_jadd		to add journals to a filesystem online
+  gfs2_tool		to manipulate, examine and tune a filesystem
   gfs2_quota	to examine and change quota values in a filesystem
   gfs2_convert	to convert a gfs filesystem to gfs2 in-place
   mount.gfs2	to help mount(8) mount a filesystem
-  mkfs.gfs2	to make a filesystem
+  mkfs.gfs2		to make a filesystem
-- 
1.6.2.5




* [Cluster-devel] [PATCH 03/11] GFS2: Clean up stuffed file copying
  2010-05-17 12:40   ` [Cluster-devel] [PATCH 02/11] GFS2: docs update Steven Whitehouse
@ 2010-05-17 12:40     ` Steven Whitehouse
  2010-05-17 12:40       ` [Cluster-devel] [PATCH 04/11] GFS2: glock livelock Steven Whitehouse
  0 siblings, 1 reply; 27+ messages in thread
From: Steven Whitehouse @ 2010-05-17 12:40 UTC (permalink / raw)
  To: cluster-devel.redhat.com

If the inode size was corrupt for stuffed files, it was possible
for the copying of data to overrun the block and/or page. This patch
checks for that condition so that this is no longer possible.

This is also preparation for the new truncate sequence patch which
requires the ability to have stuffed files with larger sizes than
(disk block size - sizeof(on disk inode)) with the restriction that
only the initial part of the file may be non-zero.
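
The check described above can be sketched in userspace as follows. This is only a model under stated assumptions: copy_stuffed() and its parameters are illustrative names, hdr_size stands in for sizeof(struct gfs2_dinode), and plain buffers stand in for the buffer head and the page:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Userspace model of the clamped copy; every name here is illustrative.
 * block/block_size: the on-disk inode block
 * hdr_size:         size of the on-disk inode header before the data
 * isize:            the (possibly corrupt) inode size
 * page/page_size:   the destination buffer
 * Returns the number of data bytes actually copied.
 */
static size_t copy_stuffed(void *page, size_t page_size,
			   const void *block, size_t block_size,
			   size_t hdr_size, uint64_t isize)
{
	size_t avail, dsize;

	assert(hdr_size <= block_size);
	assert(block_size <= page_size);

	/* Never read past the end of the block, whatever isize claims. */
	avail = block_size - hdr_size;
	dsize = isize > avail ? avail : (size_t)isize;

	memcpy(page, (const char *)block + hdr_size, dsize);
	/* Zero the rest of the page so no stale data is exposed. */
	memset((char *)page + dsize, 0, page_size - dsize);
	return dsize;
}
```

An inode size larger than the space after the header is simply clamped, so the tail of the file reads back as zeros, which matches the restriction that only the initial part of a stuffed file may be non-zero.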

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
---
 fs/gfs2/aops.c |    8 +++++---
 fs/gfs2/bmap.c |   17 ++++++++++-------
 2 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index 0c1d0b8..a739a0a 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -418,6 +418,7 @@ static int gfs2_jdata_writepages(struct address_space *mapping,
 static int stuffed_readpage(struct gfs2_inode *ip, struct page *page)
 {
 	struct buffer_head *dibh;
+	u64 dsize = i_size_read(&ip->i_inode);
 	void *kaddr;
 	int error;
 
@@ -437,9 +438,10 @@ static int stuffed_readpage(struct gfs2_inode *ip, struct page *page)
 		return error;
 
 	kaddr = kmap_atomic(page, KM_USER0);
-	memcpy(kaddr, dibh->b_data + sizeof(struct gfs2_dinode),
-	       ip->i_disksize);
-	memset(kaddr + ip->i_disksize, 0, PAGE_CACHE_SIZE - ip->i_disksize);
+	if (dsize > (dibh->b_size - sizeof(struct gfs2_dinode)))
+		dsize = (dibh->b_size - sizeof(struct gfs2_dinode));
+	memcpy(kaddr, dibh->b_data + sizeof(struct gfs2_dinode), dsize);
+	memset(kaddr + dsize, 0, PAGE_CACHE_SIZE - dsize);
 	kunmap_atomic(kaddr, KM_USER0);
 	flush_dcache_page(page);
 	brelse(dibh);
diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index 583e823..0db0cd9 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -72,11 +72,13 @@ static int gfs2_unstuffer_page(struct gfs2_inode *ip, struct buffer_head *dibh,
 
 	if (!PageUptodate(page)) {
 		void *kaddr = kmap(page);
+		u64 dsize = i_size_read(inode);
+ 
+		if (dsize > (dibh->b_size - sizeof(struct gfs2_dinode)))
+			dsize = dibh->b_size - sizeof(struct gfs2_dinode);
 
-		memcpy(kaddr, dibh->b_data + sizeof(struct gfs2_dinode),
-		       ip->i_disksize);
-		memset(kaddr + ip->i_disksize, 0,
-		       PAGE_CACHE_SIZE - ip->i_disksize);
+		memcpy(kaddr, dibh->b_data + sizeof(struct gfs2_dinode), dsize);
+		memset(kaddr + dsize, 0, PAGE_CACHE_SIZE - dsize);
 		kunmap(page);
 
 		SetPageUptodate(page);
@@ -1039,13 +1041,14 @@ static int trunc_start(struct gfs2_inode *ip, u64 size)
 		goto out;
 
 	if (gfs2_is_stuffed(ip)) {
-		ip->i_disksize = size;
+		u64 dsize = size + sizeof(struct gfs2_dinode);
 		ip->i_inode.i_mtime = ip->i_inode.i_ctime = CURRENT_TIME;
 		gfs2_trans_add_bh(ip->i_gl, dibh, 1);
 		gfs2_dinode_out(ip, dibh->b_data);
-		gfs2_buffer_clear_tail(dibh, sizeof(struct gfs2_dinode) + size);
+		if (dsize > dibh->b_size)
+			dsize = dibh->b_size;
+		gfs2_buffer_clear_tail(dibh, dsize);
 		error = 1;
-
 	} else {
 		if (size & (u64)(sdp->sd_sb.sb_bsize - 1))
 			error = gfs2_block_truncate_page(ip->i_inode.i_mapping);
-- 
1.6.2.5




* [Cluster-devel] [PATCH 04/11] GFS2: glock livelock
  2010-05-17 12:40     ` [Cluster-devel] [PATCH 03/11] GFS2: Clean up stuffed file copying Steven Whitehouse
@ 2010-05-17 12:40       ` Steven Whitehouse
  2010-05-17 12:40         ` [Cluster-devel] [PATCH 05/11] GFS2: Various gfs2_logd improvements Steven Whitehouse
  0 siblings, 1 reply; 27+ messages in thread
From: Steven Whitehouse @ 2010-05-17 12:40 UTC (permalink / raw)
  To: cluster-devel.redhat.com

From: Bob Peterson <rpeterso@redhat.com>

This patch fixes several GFS2 problems with the reclaiming of
unlinked dinodes.  First, there were a couple of livelocks where
everything would come to a halt waiting for a glock that was
seemingly held by a process that no longer existed.  In fact, the
process did exist; it just had the wrong pid number in the holder
information.  Second, there was a lock ordering problem between
inode locking and glock locking.  Third, glock/inode contention
could sometimes cause inodes to be improperly marked invalid by
iget_failed.
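
The wrong-pid symptom is handled in the glock.c hunk of this patch, where gfs2_holder_reinit() now drops the old owner reference and records the current task. A minimal userspace sketch of that pattern follows; struct pid_ref, pid_get(), pid_put() and holder_reinit() are hypothetical stand-ins for the kernel's struct pid, get_pid(), put_pid() and gfs2_holder_reinit(), with a plain counter in place of real reference counting:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a reference-counted pid object. */
struct pid_ref {
	int nr;		/* pid number */
	int count;	/* number of references held */
};

static struct pid_ref *pid_get(struct pid_ref *p)
{
	if (p)
		p->count++;
	return p;
}

static void pid_put(struct pid_ref *p)
{
	if (p)
		p->count--;
}

/* Stand-in for a glock holder that remembers who owns it. */
struct holder {
	unsigned int state;
	unsigned int flags;
	struct pid_ref *owner;
};

/*
 * Reuse a holder for a new request. The owner is refreshed so that
 * the holder never keeps pointing at the task that used it last.
 */
static void holder_reinit(struct holder *gh, unsigned int state,
			  unsigned int flags, struct pid_ref *current_pid)
{
	assert(gh != NULL);
	gh->state = state;
	gh->flags = flags;
	if (gh->owner)
		pid_put(gh->owner);
	gh->owner = pid_get(current_pid);
}
```

Without the refresh, a holder initialised by one task and reinitialised by another would still report the first task as owner, which is how a live process could look like one that no longer existed.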

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/gfs2/dir.c        |    2 +-
 fs/gfs2/export.c     |    2 +-
 fs/gfs2/glock.c      |    3 +
 fs/gfs2/inode.c      |  101 +++++++++++++++++++++++++++++++++++++++++++++----
 fs/gfs2/inode.h      |    5 +-
 fs/gfs2/ops_fstype.c |    2 +-
 fs/gfs2/rgrp.c       |   58 +++++++++++++++++++++-------
 7 files changed, 144 insertions(+), 29 deletions(-)

diff --git a/fs/gfs2/dir.c b/fs/gfs2/dir.c
index 25fddc1..8295c5b 100644
--- a/fs/gfs2/dir.c
+++ b/fs/gfs2/dir.c
@@ -1475,7 +1475,7 @@ struct inode *gfs2_dir_search(struct inode *dir, const struct qstr *name)
 		inode = gfs2_inode_lookup(dir->i_sb, 
 				be16_to_cpu(dent->de_type),
 				be64_to_cpu(dent->de_inum.no_addr),
-				be64_to_cpu(dent->de_inum.no_formal_ino), 0);
+				be64_to_cpu(dent->de_inum.no_formal_ino));
 		brelse(bh);
 		return inode;
 	}
diff --git a/fs/gfs2/export.c b/fs/gfs2/export.c
index d15876e..d81bc7e 100644
--- a/fs/gfs2/export.c
+++ b/fs/gfs2/export.c
@@ -169,7 +169,7 @@ static struct dentry *gfs2_get_dentry(struct super_block *sb,
 	if (error)
 		goto fail;
 
-	inode = gfs2_inode_lookup(sb, DT_UNKNOWN, inum->no_addr, 0, 0);
+	inode = gfs2_inode_lookup(sb, DT_UNKNOWN, inum->no_addr, 0);
 	if (IS_ERR(inode)) {
 		error = PTR_ERR(inode);
 		goto fail;
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 454d4b4..ddcdbf4 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -855,6 +855,9 @@ void gfs2_holder_reinit(unsigned int state, unsigned flags, struct gfs2_holder *
 	gh->gh_flags = flags;
 	gh->gh_iflags = 0;
 	gh->gh_ip = (unsigned long)__builtin_return_address(0);
+	if (gh->gh_owner_pid)
+		put_pid(gh->gh_owner_pid);
+	gh->gh_owner_pid = get_pid(task_pid(current));
 }
 
 /**
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index b1bf269..51d8061 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -158,7 +158,6 @@ void gfs2_set_iop(struct inode *inode)
  * @sb: The super block
  * @no_addr: The inode number
  * @type: The type of the inode
- * @skip_freeing: set this not return an inode if it is currently being freed.
  *
  * Returns: A VFS inode, or an error
  */
@@ -166,17 +165,14 @@ void gfs2_set_iop(struct inode *inode)
 struct inode *gfs2_inode_lookup(struct super_block *sb,
 				unsigned int type,
 				u64 no_addr,
-				u64 no_formal_ino, int skip_freeing)
+				u64 no_formal_ino)
 {
 	struct inode *inode;
 	struct gfs2_inode *ip;
 	struct gfs2_glock *io_gl;
 	int error;
 
-	if (skip_freeing)
-		inode = gfs2_iget_skip(sb, no_addr);
-	else
-		inode = gfs2_iget(sb, no_addr);
+	inode = gfs2_iget(sb, no_addr);
 	ip = GFS2_I(inode);
 
 	if (!inode)
@@ -234,13 +230,100 @@ fail_glock:
 fail_iopen:
 	gfs2_glock_put(io_gl);
 fail_put:
-	ip->i_gl->gl_object = NULL;
+	if (inode->i_state & I_NEW)
+		ip->i_gl->gl_object = NULL;
 	gfs2_glock_put(ip->i_gl);
 fail:
-	iget_failed(inode);
+	if (inode->i_state & I_NEW)
+		iget_failed(inode);
+	else
+		iput(inode);
 	return ERR_PTR(error);
 }
 
+/**
+ * gfs2_unlinked_inode_lookup - Lookup an unlinked inode for reclamation
+ * @sb: The super block
+ * @no_addr: The inode number
+ * @inode: A pointer to the inode found, if any
+ *
+ * Returns: 0 and *inode if no errors occurred.  If an error occurs,
+ *          the resulting *inode may or may not be NULL.
+ */
+
+int gfs2_unlinked_inode_lookup(struct super_block *sb, u64 no_addr,
+			       struct inode **inode)
+{
+	struct gfs2_sbd *sdp;
+	struct gfs2_inode *ip;
+	struct gfs2_glock *io_gl;
+	int error;
+	struct gfs2_holder gh;
+
+	*inode = gfs2_iget_skip(sb, no_addr);
+
+	if (!(*inode))
+		return -ENOBUFS;
+
+	if (!((*inode)->i_state & I_NEW))
+		return -ENOBUFS;
+
+	ip = GFS2_I(*inode);
+	sdp = GFS2_SB(*inode);
+	ip->i_no_formal_ino = -1;
+
+	error = gfs2_glock_get(sdp, no_addr, &gfs2_inode_glops, CREATE, &ip->i_gl);
+	if (unlikely(error))
+		goto fail;
+	ip->i_gl->gl_object = ip;
+
+	error = gfs2_glock_get(sdp, no_addr, &gfs2_iopen_glops, CREATE, &io_gl);
+	if (unlikely(error))
+		goto fail_put;
+
+	set_bit(GIF_INVALID, &ip->i_flags);
+	error = gfs2_glock_nq_init(io_gl, LM_ST_SHARED, LM_FLAG_TRY | GL_EXACT,
+				   &ip->i_iopen_gh);
+	if (unlikely(error)) {
+		if (error == GLR_TRYFAILED)
+			error = 0;
+		goto fail_iopen;
+	}
+	ip->i_iopen_gh.gh_gl->gl_object = ip;
+	gfs2_glock_put(io_gl);
+
+	(*inode)->i_mode = DT2IF(DT_UNKNOWN);
+
+	/*
+	 * We must read the inode in order to work out its type in
+	 * this case. Note that this doesn't happen often as we normally
+	 * know the type beforehand. This code path only occurs during
+	 * unlinked inode recovery (where it is safe to do this glock,
+	 * which is not true in the general case).
+	 */
+	error = gfs2_glock_nq_init(ip->i_gl, LM_ST_EXCLUSIVE, LM_FLAG_TRY,
+				   &gh);
+	if (unlikely(error)) {
+		if (error == GLR_TRYFAILED)
+			error = 0;
+		goto fail_glock;
+	}
+	/* Inode is now uptodate */
+	gfs2_glock_dq_uninit(&gh);
+	gfs2_set_iop(*inode);
+
+	return 0;
+fail_glock:
+	gfs2_glock_dq(&ip->i_iopen_gh);
+fail_iopen:
+	gfs2_glock_put(io_gl);
+fail_put:
+	ip->i_gl->gl_object = NULL;
+	gfs2_glock_put(ip->i_gl);
+fail:
+	return error;
+}
+
 static int gfs2_dinode_in(struct gfs2_inode *ip, const void *buf)
 {
 	const struct gfs2_dinode *str = buf;
@@ -862,7 +945,7 @@ struct inode *gfs2_createi(struct gfs2_holder *ghs, const struct qstr *name,
 		goto fail_gunlock2;
 
 	inode = gfs2_inode_lookup(dir->i_sb, IF2DT(mode), inum.no_addr,
-				  inum.no_formal_ino, 0);
+				  inum.no_formal_ino);
 	if (IS_ERR(inode))
 		goto fail_gunlock2;
 
diff --git a/fs/gfs2/inode.h b/fs/gfs2/inode.h
index c341aaf..e161461 100644
--- a/fs/gfs2/inode.h
+++ b/fs/gfs2/inode.h
@@ -83,8 +83,9 @@ static inline void gfs2_inum_out(const struct gfs2_inode *ip,
 
 extern void gfs2_set_iop(struct inode *inode);
 extern struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned type, 
-				       u64 no_addr, u64 no_formal_ino,
-				       int skip_freeing);
+				       u64 no_addr, u64 no_formal_ino);
+extern int gfs2_unlinked_inode_lookup(struct super_block *sb, u64 no_addr,
+				      struct inode **inode);
 extern struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr);
 
 extern int gfs2_inode_refresh(struct gfs2_inode *ip);
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index c1309ed..dc35f34 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -487,7 +487,7 @@ static int gfs2_lookup_root(struct super_block *sb, struct dentry **dptr,
 	struct dentry *dentry;
 	struct inode *inode;
 
-	inode = gfs2_inode_lookup(sb, DT_DIR, no_addr, 0, 0);
+	inode = gfs2_inode_lookup(sb, DT_DIR, no_addr, 0);
 	if (IS_ERR(inode)) {
 		fs_err(sdp, "can't read in %s inode: %ld\n", name, PTR_ERR(inode));
 		return PTR_ERR(inode);
diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c
index 503b842..3739155 100644
--- a/fs/gfs2/rgrp.c
+++ b/fs/gfs2/rgrp.c
@@ -948,18 +948,20 @@ static int try_rgrp_fit(struct gfs2_rgrpd *rgd, struct gfs2_alloc *al)
  * try_rgrp_unlink - Look for any unlinked, allocated, but unused inodes
  * @rgd: The rgrp
  *
- * Returns: The inode, if one has been found
+ * Returns: 0 if no error
+ *          The inode, if one has been found, in inode.
  */
 
-static struct inode *try_rgrp_unlink(struct gfs2_rgrpd *rgd, u64 *last_unlinked,
-				     u64 skip)
+static int try_rgrp_unlink(struct gfs2_rgrpd *rgd, u64 *last_unlinked,
+			   u64 skip, struct inode **inode)
 {
-	struct inode *inode;
 	u32 goal = 0, block;
 	u64 no_addr;
 	struct gfs2_sbd *sdp = rgd->rd_sbd;
 	unsigned int n;
+	int error = 0;
 
+	*inode = NULL;
 	for(;;) {
 		if (goal >= rgd->rd_data)
 			break;
@@ -979,14 +981,14 @@ static struct inode *try_rgrp_unlink(struct gfs2_rgrpd *rgd, u64 *last_unlinked,
 		if (no_addr == skip)
 			continue;
 		*last_unlinked = no_addr;
-		inode = gfs2_inode_lookup(rgd->rd_sbd->sd_vfs, DT_UNKNOWN,
-					  no_addr, -1, 1);
-		if (!IS_ERR(inode))
-			return inode;
+		error = gfs2_unlinked_inode_lookup(rgd->rd_sbd->sd_vfs,
+						   no_addr, inode);
+		if (*inode || error)
+			return error;
 	}
 
 	rgd->rd_flags &= ~GFS2_RDF_CHECK;
-	return NULL;
+	return 0;
 }
 
 /**
@@ -1096,12 +1098,27 @@ static struct inode *get_local_rgrp(struct gfs2_inode *ip, u64 *last_unlinked)
 		case 0:
 			if (try_rgrp_fit(rgd, al))
 				goto out;
-			if (rgd->rd_flags & GFS2_RDF_CHECK)
-				inode = try_rgrp_unlink(rgd, last_unlinked, ip->i_no_addr);
+			/* If the rg came in already locked, there's no
+			   way we can recover from a failed try_rgrp_unlink
+			   because that would require an iput which can only
+			   happen after the rgrp is unlocked. */
+			if (!rg_locked && rgd->rd_flags & GFS2_RDF_CHECK)
+				error = try_rgrp_unlink(rgd, last_unlinked,
+							ip->i_no_addr, &inode);
 			if (!rg_locked)
 				gfs2_glock_dq_uninit(&al->al_rgd_gh);
-			if (inode)
+			if (inode) {
+				if (error) {
+					if (inode->i_state & I_NEW)
+						iget_failed(inode);
+					else
+						iput(inode);
+					return ERR_PTR(error);
+				}
 				return inode;
+			}
+			if (error)
+				return ERR_PTR(error);
 			/* fall through */
 		case GLR_TRYFAILED:
 			rgd = recent_rgrp_next(rgd);
@@ -1130,12 +1147,23 @@ static struct inode *get_local_rgrp(struct gfs2_inode *ip, u64 *last_unlinked)
 		case 0:
 			if (try_rgrp_fit(rgd, al))
 				goto out;
-			if (rgd->rd_flags & GFS2_RDF_CHECK)
-				inode = try_rgrp_unlink(rgd, last_unlinked, ip->i_no_addr);
+			if (!rg_locked && rgd->rd_flags & GFS2_RDF_CHECK)
+				error = try_rgrp_unlink(rgd, last_unlinked,
+							ip->i_no_addr, &inode);
 			if (!rg_locked)
 				gfs2_glock_dq_uninit(&al->al_rgd_gh);
-			if (inode)
+			if (inode) {
+				if (error) {
+					if (inode->i_state & I_NEW)
+						iget_failed(inode);
+					else
+						iput(inode);
+					return ERR_PTR(error);
+				}
 				return inode;
+			}
+			if (error)
+				return ERR_PTR(error);
 			break;
 
 		case GLR_TRYFAILED:
-- 
1.6.2.5




* [Cluster-devel] [PATCH 05/11] GFS2: Various gfs2_logd improvements
  2010-05-17 12:40       ` [Cluster-devel] [PATCH 04/11] GFS2: glock livelock Steven Whitehouse
@ 2010-05-17 12:40         ` Steven Whitehouse
  2010-05-17 12:40           ` [Cluster-devel] [PATCH 06/11] GFS2: fix quota state reporting Steven Whitehouse
  0 siblings, 1 reply; 27+ messages in thread
From: Steven Whitehouse @ 2010-05-17 12:40 UTC (permalink / raw)
  To: cluster-devel.redhat.com

From: Benjamin Marzinski <bmarzins@redhat.com>

This patch contains various tweaks to how log flushes and active item writeback
work. gfs2_logd is now managed by a waitqueue, and gfs2_log_reserve() now waits
for gfs2_logd to do the log flushing.  Multiple functions were rewritten to
remove the need to call gfs2_log_lock(). Instead of using one test to see if
gfs2_logd had work to do, there are now separate tests to check if there
are too many buffers in the incore log or if there are too many items on the
active items list.
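
The two tests can be modelled in userspace roughly as below. This is a sketch under assumptions: struct log_state and its helpers are illustrative names, C11 atomics stand in for the kernel's atomic_t, and the thresholds are set to 2/5 and 4/5 of the journal size, as init_journal() does in this patch:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative model of the log accounting fields used by the tests. */
struct log_state {
	unsigned int journal_blocks;	/* total blocks in the journal */
	atomic_int blks_free;		/* blocks not currently in use */
	atomic_int pinned;		/* buffers pinned in the incore log */
	atomic_int thresh1;		/* pinned-buffer threshold */
	atomic_int thresh2;		/* used-block threshold */
};

static void log_state_init(struct log_state *ls, unsigned int journal_blocks)
{
	assert(journal_blocks > 0);
	ls->journal_blocks = journal_blocks;
	atomic_init(&ls->blks_free, (int)journal_blocks);
	atomic_init(&ls->pinned, 0);
	atomic_init(&ls->thresh1, (int)(2 * journal_blocks / 5));
	atomic_init(&ls->thresh2, (int)(4 * journal_blocks / 5));
}

/* Too many buffers pinned in the incore log? */
static bool jrnl_flush_reqd(struct log_state *ls)
{
	return atomic_load(&ls->pinned) >= atomic_load(&ls->thresh1);
}

/* Too much of the journal in use (pinned plus active items)? */
static bool ail_flush_reqd(struct log_state *ls)
{
	int used = (int)ls->journal_blocks - atomic_load(&ls->blks_free);

	return used >= atomic_load(&ls->thresh2);
}
```

Keeping the tests separate lets the daemon choose between a plain log flush and starting active item writeback, and neither test needs the log lock because each reads only atomic counters.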

This patch is a port of a patch Steve Whitehouse wrote about a year ago, with
some minor changes.  Since gfs2_ail1_start always submits all the active items,
it no longer needs to keep track of the first ai submitted, so this has been
removed. In gfs2_log_reserve(), the order of the calls to
prepare_to_wait_exclusive() and wake_up() when firing off the logd thread has
been switched.  If it called wake_up first there was a small window for a race,
where logd could run and return before gfs2_log_reserve was ready to get woken
up. If gfs2_logd ran, but did not free up enough blocks, gfs2_log_reserve()
would be left waiting for gfs2_logd to eventually run because it timed out.
Finally, gt_logd_secs, which controls how long to wait before gfs2_logd times
out and flushes the log, can now be set on mount with ar_commit.
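
The lock-free part of the new gfs2_log_reserve() is a compare-and-exchange retry loop on the free block count. A minimal userspace sketch of that loop follows, assuming C11 atomics in place of atomic_cmpxchg(); try_reserve() is a hypothetical name, and where the kernel code would queue itself with prepare_to_wait_exclusive() and only then wake gfs2_logd, this sketch just returns false:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Try to take blks blocks from the free count while always leaving
 * more than `reserved` blocks spare. Returns false where the real
 * code would sleep until the log daemon has freed some space.
 */
static bool try_reserve(atomic_uint *blks_free, unsigned int blks,
			unsigned int reserved)
{
	unsigned int wanted = blks + reserved;
	unsigned int free_blocks = atomic_load(blks_free);

	assert(blks > 0);
	do {
		if (free_blocks <= wanted)
			return false;
		/*
		 * On failure the compare-exchange reloads free_blocks
		 * with the current value, so the check above is redone
		 * against fresh data before the next attempt.
		 */
	} while (!atomic_compare_exchange_weak(blks_free, &free_blocks,
					       free_blocks - blks));
	return true;
}
```

The ordering point from the description applies to the part left out here: a waiter has to be on the wait queue before it wakes the daemon, otherwise the daemon can finish and issue its wakeup before anyone is there to receive it.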

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
---
 fs/gfs2/incore.h     |   10 ++--
 fs/gfs2/log.c        |  157 ++++++++++++++++++++++++++++----------------------
 fs/gfs2/log.h        |    1 -
 fs/gfs2/lops.c       |    2 +
 fs/gfs2/meta_io.c    |    1 +
 fs/gfs2/ops_fstype.c |   17 +++---
 fs/gfs2/super.c      |    8 +-
 fs/gfs2/sys.c        |    4 -
 fs/gfs2/trans.c      |   18 ++++++
 9 files changed, 126 insertions(+), 92 deletions(-)

diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index 3aac46f..08dd657 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -439,9 +439,6 @@ struct gfs2_args {
 struct gfs2_tune {
 	spinlock_t gt_spin;
 
-	unsigned int gt_incore_log_blocks;
-	unsigned int gt_log_flush_secs;
-
 	unsigned int gt_logd_secs;
 
 	unsigned int gt_quota_simul_sync; /* Max quotavals to sync at once */
@@ -618,6 +615,7 @@ struct gfs2_sbd {
 	unsigned int sd_log_commited_databuf;
 	int sd_log_commited_revoke;
 
+	atomic_t sd_log_pinned;
 	unsigned int sd_log_num_buf;
 	unsigned int sd_log_num_revoke;
 	unsigned int sd_log_num_rg;
@@ -629,15 +627,17 @@ struct gfs2_sbd {
 	struct list_head sd_log_le_databuf;
 	struct list_head sd_log_le_ordered;
 
+	atomic_t sd_log_thresh1;
+	atomic_t sd_log_thresh2;
 	atomic_t sd_log_blks_free;
-	struct mutex sd_log_reserve_mutex;
+	wait_queue_head_t sd_log_waitq;
+	wait_queue_head_t sd_logd_waitq;
 
 	u64 sd_log_sequence;
 	unsigned int sd_log_head;
 	unsigned int sd_log_tail;
 	int sd_log_idle;
 
-	unsigned long sd_log_flush_time;
 	struct rw_semaphore sd_log_flush_lock;
 	atomic_t sd_log_in_flight;
 	wait_queue_head_t sd_log_flush_wait;
diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index e5bf4b5..d5959df 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -168,12 +168,11 @@ static int gfs2_ail1_empty_one(struct gfs2_sbd *sdp, struct gfs2_ail *ai, int fl
 	return list_empty(&ai->ai_ail1_list);
 }
 
-static void gfs2_ail1_start(struct gfs2_sbd *sdp, int flags)
+static void gfs2_ail1_start(struct gfs2_sbd *sdp)
 {
 	struct list_head *head;
 	u64 sync_gen;
-	struct list_head *first;
-	struct gfs2_ail *first_ai, *ai, *tmp;
+	struct gfs2_ail *ai;
 	int done = 0;
 
 	gfs2_log_lock(sdp);
@@ -184,21 +183,9 @@ static void gfs2_ail1_start(struct gfs2_sbd *sdp, int flags)
 	}
 	sync_gen = sdp->sd_ail_sync_gen++;
 
-	first = head->prev;
-	first_ai = list_entry(first, struct gfs2_ail, ai_list);
-	first_ai->ai_sync_gen = sync_gen;
-	gfs2_ail1_start_one(sdp, first_ai); /* This may drop log lock */
-
-	if (flags & DIO_ALL)
-		first = NULL;
-
 	while(!done) {
-		if (first && (head->prev != first ||
-			      gfs2_ail1_empty_one(sdp, first_ai, 0)))
-			break;
-
 		done = 1;
-		list_for_each_entry_safe_reverse(ai, tmp, head, ai_list) {
+		list_for_each_entry_reverse(ai, head, ai_list) {
 			if (ai->ai_sync_gen >= sync_gen)
 				continue;
 			ai->ai_sync_gen = sync_gen;
@@ -290,58 +277,57 @@ static void ail2_empty(struct gfs2_sbd *sdp, unsigned int new_tail)
  * flush time, so we ensure that we have just enough free blocks at all
  * times to avoid running out during a log flush.
  *
+ * We no longer flush the log here, instead we wake up logd to do that
+ * for us. To avoid the thundering herd and to ensure that we deal fairly
+ * with queued waiters, we use an exclusive wait. This means that when we
+ * get woken with enough journal space to get our reservation, we need to
+ * wake the next waiter on the list.
+ *
  * Returns: errno
  */
 
 int gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks)
 {
-	unsigned int try = 0;
 	unsigned reserved_blks = 6 * (4096 / sdp->sd_vfs->s_blocksize);
+	unsigned wanted = blks + reserved_blks;
+	DEFINE_WAIT(wait);
+	int did_wait = 0;
+	unsigned int free_blocks;
 
 	if (gfs2_assert_warn(sdp, blks) ||
 	    gfs2_assert_warn(sdp, blks <= sdp->sd_jdesc->jd_blocks))
 		return -EINVAL;
-
-	mutex_lock(&sdp->sd_log_reserve_mutex);
-	gfs2_log_lock(sdp);
-	while(atomic_read(&sdp->sd_log_blks_free) <= (blks + reserved_blks)) {
-		gfs2_log_unlock(sdp);
-		gfs2_ail1_empty(sdp, 0);
-		gfs2_log_flush(sdp, NULL);
-
-		if (try++)
-			gfs2_ail1_start(sdp, 0);
-		gfs2_log_lock(sdp);
+retry:
+	free_blocks = atomic_read(&sdp->sd_log_blks_free);
+	if (unlikely(free_blocks <= wanted)) {
+		do {
+			prepare_to_wait_exclusive(&sdp->sd_log_waitq, &wait,
+					TASK_UNINTERRUPTIBLE);
+			wake_up(&sdp->sd_logd_waitq);
+			did_wait = 1;
+			if (atomic_read(&sdp->sd_log_blks_free) <= wanted)
+				io_schedule();
+			free_blocks = atomic_read(&sdp->sd_log_blks_free);
+		} while(free_blocks <= wanted);
+		finish_wait(&sdp->sd_log_waitq, &wait);
 	}
-	atomic_sub(blks, &sdp->sd_log_blks_free);
+	if (atomic_cmpxchg(&sdp->sd_log_blks_free, free_blocks,
+				free_blocks - blks) != free_blocks)
+		goto retry;
 	trace_gfs2_log_blocks(sdp, -blks);
-	gfs2_log_unlock(sdp);
-	mutex_unlock(&sdp->sd_log_reserve_mutex);
+
+	/*
+	 * If we waited, then so might others, wake them up _after_ we get
+	 * our share of the log.
+	 */
+	if (unlikely(did_wait))
+		wake_up(&sdp->sd_log_waitq);
 
 	down_read(&sdp->sd_log_flush_lock);
 
 	return 0;
 }
 
-/**
- * gfs2_log_release - Release a given number of log blocks
- * @sdp: The GFS2 superblock
- * @blks: The number of blocks
- *
- */
-
-void gfs2_log_release(struct gfs2_sbd *sdp, unsigned int blks)
-{
-
-	gfs2_log_lock(sdp);
-	atomic_add(blks, &sdp->sd_log_blks_free);
-	trace_gfs2_log_blocks(sdp, blks);
-	gfs2_assert_withdraw(sdp,
-			     atomic_read(&sdp->sd_log_blks_free) <= sdp->sd_jdesc->jd_blocks);
-	gfs2_log_unlock(sdp);
-	up_read(&sdp->sd_log_flush_lock);
-}
-
 static u64 log_bmap(struct gfs2_sbd *sdp, unsigned int lbn)
 {
 	struct gfs2_journal_extent *je;
@@ -559,11 +545,10 @@ static void log_pull_tail(struct gfs2_sbd *sdp, unsigned int new_tail)
 
 	ail2_empty(sdp, new_tail);
 
-	gfs2_log_lock(sdp);
 	atomic_add(dist, &sdp->sd_log_blks_free);
 	trace_gfs2_log_blocks(sdp, dist);
-	gfs2_assert_withdraw(sdp, atomic_read(&sdp->sd_log_blks_free) <= sdp->sd_jdesc->jd_blocks);
-	gfs2_log_unlock(sdp);
+	gfs2_assert_withdraw(sdp, atomic_read(&sdp->sd_log_blks_free) <=
+			     sdp->sd_jdesc->jd_blocks);
 
 	sdp->sd_log_tail = new_tail;
 }
@@ -822,6 +807,13 @@ static void buf_lo_incore_commit(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
  * @sdp: the filesystem
  * @tr: the transaction
  *
+ * We wake up gfs2_logd if the number of pinned blocks exceeds thresh1
+ * or the total number of used blocks (pinned blocks plus AIL blocks)
+ * is greater than thresh2.
+ *
+ * At mount time thresh1 is 2/5ths of journal size, thresh2 is 4/5ths of
+ * journal size.
+ *
  * Returns: errno
  */
 
@@ -832,10 +824,10 @@ void gfs2_log_commit(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
 
 	up_read(&sdp->sd_log_flush_lock);
 
-	gfs2_log_lock(sdp);
-	if (sdp->sd_log_num_buf > gfs2_tune_get(sdp, gt_incore_log_blocks))
-		wake_up_process(sdp->sd_logd_process);
-	gfs2_log_unlock(sdp);
+	if (atomic_read(&sdp->sd_log_pinned) > atomic_read(&sdp->sd_log_thresh1) ||
+	    ((sdp->sd_jdesc->jd_blocks - atomic_read(&sdp->sd_log_blks_free)) >
+	    atomic_read(&sdp->sd_log_thresh2)))
+		wake_up(&sdp->sd_logd_waitq);
 }
 
 /**
@@ -882,13 +874,23 @@ void gfs2_meta_syncfs(struct gfs2_sbd *sdp)
 {
 	gfs2_log_flush(sdp, NULL);
 	for (;;) {
-		gfs2_ail1_start(sdp, DIO_ALL);
+		gfs2_ail1_start(sdp);
 		if (gfs2_ail1_empty(sdp, DIO_ALL))
 			break;
 		msleep(10);
 	}
 }
 
+static inline int gfs2_jrnl_flush_reqd(struct gfs2_sbd *sdp)
+{
+	return (atomic_read(&sdp->sd_log_pinned) >= atomic_read(&sdp->sd_log_thresh1));
+}
+
+static inline int gfs2_ail_flush_reqd(struct gfs2_sbd *sdp)
+{
+	unsigned int used_blocks = sdp->sd_jdesc->jd_blocks - atomic_read(&sdp->sd_log_blks_free);
+	return used_blocks >= atomic_read(&sdp->sd_log_thresh2);
+}
 
 /**
  * gfs2_logd - Update log tail as Active Items get flushed to in-place blocks
@@ -901,28 +903,43 @@ void gfs2_meta_syncfs(struct gfs2_sbd *sdp)
 int gfs2_logd(void *data)
 {
 	struct gfs2_sbd *sdp = data;
-	unsigned long t;
-	int need_flush;
+	unsigned long t = 1;
+	DEFINE_WAIT(wait);
+	unsigned preflush;
 
 	while (!kthread_should_stop()) {
-		/* Advance the log tail */
 
-		t = sdp->sd_log_flush_time +
-		    gfs2_tune_get(sdp, gt_log_flush_secs) * HZ;
+		preflush = atomic_read(&sdp->sd_log_pinned);
+		if (gfs2_jrnl_flush_reqd(sdp) || t == 0) {
+			gfs2_ail1_empty(sdp, DIO_ALL);
+			gfs2_log_flush(sdp, NULL);
+			gfs2_ail1_empty(sdp, DIO_ALL);
+		}
 
-		gfs2_ail1_empty(sdp, DIO_ALL);
-		gfs2_log_lock(sdp);
-		need_flush = sdp->sd_log_num_buf > gfs2_tune_get(sdp, gt_incore_log_blocks);
-		gfs2_log_unlock(sdp);
-		if (need_flush || time_after_eq(jiffies, t)) {
+		if (gfs2_ail_flush_reqd(sdp)) {
+			gfs2_ail1_start(sdp);
+			io_schedule();
+			gfs2_ail1_empty(sdp, 0);
 			gfs2_log_flush(sdp, NULL);
-			sdp->sd_log_flush_time = jiffies;
+			gfs2_ail1_empty(sdp, DIO_ALL);
 		}
 
+		wake_up(&sdp->sd_log_waitq);
 		t = gfs2_tune_get(sdp, gt_logd_secs) * HZ;
 		if (freezing(current))
 			refrigerator();
-		schedule_timeout_interruptible(t);
+
+		do {
+			prepare_to_wait(&sdp->sd_logd_waitq, &wait,
+					TASK_UNINTERRUPTIBLE);
+			if (!gfs2_ail_flush_reqd(sdp) &&
+			    !gfs2_jrnl_flush_reqd(sdp) &&
+			    !kthread_should_stop())
+				t = schedule_timeout(t);
+		} while(t && !gfs2_ail_flush_reqd(sdp) &&
+			!gfs2_jrnl_flush_reqd(sdp) &&
+			!kthread_should_stop());
+		finish_wait(&sdp->sd_logd_waitq, &wait);
 	}
 
 	return 0;
diff --git a/fs/gfs2/log.h b/fs/gfs2/log.h
index 7c64510..eb570b4 100644
--- a/fs/gfs2/log.h
+++ b/fs/gfs2/log.h
@@ -51,7 +51,6 @@ unsigned int gfs2_struct2blk(struct gfs2_sbd *sdp, unsigned int nstruct,
 			    unsigned int ssize);
 
 int gfs2_log_reserve(struct gfs2_sbd *sdp, unsigned int blks);
-void gfs2_log_release(struct gfs2_sbd *sdp, unsigned int blks);
 void gfs2_log_incr_head(struct gfs2_sbd *sdp);
 
 struct buffer_head *gfs2_log_get_buf(struct gfs2_sbd *sdp);
diff --git a/fs/gfs2/lops.c b/fs/gfs2/lops.c
index adc260f..bf33f82 100644
--- a/fs/gfs2/lops.c
+++ b/fs/gfs2/lops.c
@@ -54,6 +54,7 @@ static void gfs2_pin(struct gfs2_sbd *sdp, struct buffer_head *bh)
 	if (bd->bd_ail)
 		list_move(&bd->bd_ail_st_list, &bd->bd_ail->ai_ail2_list);
 	get_bh(bh);
+	atomic_inc(&sdp->sd_log_pinned);
 	trace_gfs2_pin(bd, 1);
 }
 
@@ -94,6 +95,7 @@ static void gfs2_unpin(struct gfs2_sbd *sdp, struct buffer_head *bh,
 	trace_gfs2_pin(bd, 0);
 	gfs2_log_unlock(sdp);
 	unlock_buffer(bh);
+	atomic_dec(&sdp->sd_log_pinned);
 }
 
 
diff --git a/fs/gfs2/meta_io.c b/fs/gfs2/meta_io.c
index 0bb12c8..abafda1 100644
--- a/fs/gfs2/meta_io.c
+++ b/fs/gfs2/meta_io.c
@@ -313,6 +313,7 @@ void gfs2_remove_from_journal(struct buffer_head *bh, struct gfs2_trans *tr, int
 	struct gfs2_bufdata *bd = bh->b_private;
 
 	if (test_clear_buffer_pinned(bh)) {
+		atomic_dec(&sdp->sd_log_pinned);
 		list_del_init(&bd->bd_le.le_list);
 		if (meta) {
 			gfs2_assert_warn(sdp, sdp->sd_log_num_buf);
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index dc35f34..3593b3a 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -57,8 +57,6 @@ static void gfs2_tune_init(struct gfs2_tune *gt)
 {
 	spin_lock_init(&gt->gt_spin);
 
-	gt->gt_incore_log_blocks = 1024;
-	gt->gt_logd_secs = 1;
 	gt->gt_quota_simul_sync = 64;
 	gt->gt_quota_warn_period = 10;
 	gt->gt_quota_scale_num = 1;
@@ -101,14 +99,15 @@ static struct gfs2_sbd *init_sbd(struct super_block *sb)
 	spin_lock_init(&sdp->sd_trunc_lock);
 
 	spin_lock_init(&sdp->sd_log_lock);
-
+	atomic_set(&sdp->sd_log_pinned, 0);
 	INIT_LIST_HEAD(&sdp->sd_log_le_buf);
 	INIT_LIST_HEAD(&sdp->sd_log_le_revoke);
 	INIT_LIST_HEAD(&sdp->sd_log_le_rg);
 	INIT_LIST_HEAD(&sdp->sd_log_le_databuf);
 	INIT_LIST_HEAD(&sdp->sd_log_le_ordered);
 
-	mutex_init(&sdp->sd_log_reserve_mutex);
+	init_waitqueue_head(&sdp->sd_log_waitq);
+	init_waitqueue_head(&sdp->sd_logd_waitq);
 	INIT_LIST_HEAD(&sdp->sd_ail1_list);
 	INIT_LIST_HEAD(&sdp->sd_ail2_list);
 
@@ -733,6 +732,8 @@ static int init_journal(struct gfs2_sbd *sdp, int undo)
 	if (sdp->sd_args.ar_spectator) {
 		sdp->sd_jdesc = gfs2_jdesc_find(sdp, 0);
 		atomic_set(&sdp->sd_log_blks_free, sdp->sd_jdesc->jd_blocks);
+		atomic_set(&sdp->sd_log_thresh1, 2*sdp->sd_jdesc->jd_blocks/5);
+		atomic_set(&sdp->sd_log_thresh2, 4*sdp->sd_jdesc->jd_blocks/5);
 	} else {
 		if (sdp->sd_lockstruct.ls_jid >= gfs2_jindex_size(sdp)) {
 			fs_err(sdp, "can't mount journal #%u\n",
@@ -770,6 +771,8 @@ static int init_journal(struct gfs2_sbd *sdp, int undo)
 			goto fail_jinode_gh;
 		}
 		atomic_set(&sdp->sd_log_blks_free, sdp->sd_jdesc->jd_blocks);
+		atomic_set(&sdp->sd_log_thresh1, 2*sdp->sd_jdesc->jd_blocks/5);
+		atomic_set(&sdp->sd_log_thresh2, 4*sdp->sd_jdesc->jd_blocks/5);
 
 		/* Map the extents for this journal's blocks */
 		map_journal_extents(sdp);
@@ -951,8 +954,6 @@ static int init_threads(struct gfs2_sbd *sdp, int undo)
 	if (undo)
 		goto fail_quotad;
 
-	sdp->sd_log_flush_time = jiffies;
-
 	p = kthread_run(gfs2_logd, sdp, "gfs2_logd");
 	error = IS_ERR(p);
 	if (error) {
@@ -1160,7 +1161,7 @@ static int fill_super(struct super_block *sb, struct gfs2_args *args, int silent
                                GFS2_BASIC_BLOCK_SHIFT;
 	sdp->sd_fsb2bb = 1 << sdp->sd_fsb2bb_shift;
 
-	sdp->sd_tune.gt_log_flush_secs = sdp->sd_args.ar_commit;
+	sdp->sd_tune.gt_logd_secs = sdp->sd_args.ar_commit;
 	sdp->sd_tune.gt_quota_quantum = sdp->sd_args.ar_quota_quantum;
 	if (sdp->sd_args.ar_statfs_quantum) {
 		sdp->sd_tune.gt_statfs_slow = 0;
@@ -1323,7 +1324,7 @@ static int gfs2_get_sb(struct file_system_type *fs_type, int flags,
 	memset(&args, 0, sizeof(args));
 	args.ar_quota = GFS2_QUOTA_DEFAULT;
 	args.ar_data = GFS2_DATA_DEFAULT;
-	args.ar_commit = 60;
+	args.ar_commit = 30;
 	args.ar_statfs_quantum = 30;
 	args.ar_quota_quantum = 60;
 	args.ar_errors = GFS2_ERRORS_DEFAULT;
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index 50aac60..7a93e9f 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -1113,7 +1113,7 @@ static int gfs2_remount_fs(struct super_block *sb, int *flags, char *data)
 	int error;
 
 	spin_lock(&gt->gt_spin);
-	args.ar_commit = gt->gt_log_flush_secs;
+	args.ar_commit = gt->gt_logd_secs;
 	args.ar_quota_quantum = gt->gt_quota_quantum;
 	if (gt->gt_statfs_slow)
 		args.ar_statfs_quantum = 0;
@@ -1160,7 +1160,7 @@ static int gfs2_remount_fs(struct super_block *sb, int *flags, char *data)
 	else
 		clear_bit(SDF_NOBARRIERS, &sdp->sd_flags);
 	spin_lock(&gt->gt_spin);
-	gt->gt_log_flush_secs = args.ar_commit;
+	gt->gt_logd_secs = args.ar_commit;
 	gt->gt_quota_quantum = args.ar_quota_quantum;
 	if (args.ar_statfs_quantum) {
 		gt->gt_statfs_slow = 0;
@@ -1305,8 +1305,8 @@ static int gfs2_show_options(struct seq_file *s, struct vfsmount *mnt)
 	}
 	if (args->ar_discard)
 		seq_printf(s, ",discard");
-	val = sdp->sd_tune.gt_log_flush_secs;
-	if (val != 60)
+	val = sdp->sd_tune.gt_logd_secs;
+	if (val != 30)
 		seq_printf(s, ",commit=%d", val);
 	val = sdp->sd_tune.gt_statfs_quantum;
 	if (val != 30)
diff --git a/fs/gfs2/sys.c b/fs/gfs2/sys.c
index 419042f..2ac845d 100644
--- a/fs/gfs2/sys.c
+++ b/fs/gfs2/sys.c
@@ -469,8 +469,6 @@ static ssize_t name##_store(struct gfs2_sbd *sdp, const char *buf, size_t len)\
 }                                                                             \
 TUNE_ATTR_2(name, name##_store)
 
-TUNE_ATTR(incore_log_blocks, 0);
-TUNE_ATTR(log_flush_secs, 0);
 TUNE_ATTR(quota_warn_period, 0);
 TUNE_ATTR(quota_quantum, 0);
 TUNE_ATTR(max_readahead, 0);
@@ -482,8 +480,6 @@ TUNE_ATTR(statfs_quantum, 1);
 TUNE_ATTR_3(quota_scale, quota_scale_show, quota_scale_store);
 
 static struct attribute *tune_attrs[] = {
-	&tune_attr_incore_log_blocks.attr,
-	&tune_attr_log_flush_secs.attr,
 	&tune_attr_quota_warn_period.attr,
 	&tune_attr_quota_quantum.attr,
 	&tune_attr_max_readahead.attr,
diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c
index 4ef0e9f..9ec73a8 100644
--- a/fs/gfs2/trans.c
+++ b/fs/gfs2/trans.c
@@ -23,6 +23,7 @@
 #include "meta_io.h"
 #include "trans.h"
 #include "util.h"
+#include "trace_gfs2.h"
 
 int gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
 		     unsigned int revokes)
@@ -75,6 +76,23 @@ fail_holder_uninit:
 	return error;
 }
 
+/**
+ * gfs2_log_release - Release a given number of log blocks
+ * @sdp: The GFS2 superblock
+ * @blks: The number of blocks
+ *
+ */
+
+static void gfs2_log_release(struct gfs2_sbd *sdp, unsigned int blks)
+{
+
+	atomic_add(blks, &sdp->sd_log_blks_free);
+	trace_gfs2_log_blocks(sdp, blks);
+	gfs2_assert_withdraw(sdp, atomic_read(&sdp->sd_log_blks_free) <=
+				  sdp->sd_jdesc->jd_blocks);
+	up_read(&sdp->sd_log_flush_lock);
+}
+
 void gfs2_trans_end(struct gfs2_sbd *sdp)
 {
 	struct gfs2_trans *tr = current->journal_info;
-- 
1.6.2.5



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Cluster-devel] [PATCH 06/11] GFS2: fix quota state reporting
  2010-05-17 12:40         ` [Cluster-devel] [PATCH 05/11] GFS2: Various gfs2_logd improvements Steven Whitehouse
@ 2010-05-17 12:40           ` Steven Whitehouse
  2010-05-17 12:40             ` [Cluster-devel] [PATCH 07/11] GFS2: Add some useful messages Steven Whitehouse
  0 siblings, 1 reply; 27+ messages in thread
From: Steven Whitehouse @ 2010-05-17 12:40 UTC (permalink / raw)
  To: cluster-devel.redhat.com

From: Christoph Hellwig <hch@lst.de>

We need to report both the accounting and enforcing flags if we are
in enforcing mode.
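
As a rough userspace illustration of the intent (not the kernel code itself), the sketch below shows how a switch with a deliberate fall-through makes the enforcing case report the accounting flags as well. The enum, the flag names and their bit values are hypothetical stand-ins chosen for the example; they are not the real GFS2_QUOTA_* or XFS_QUOTA_* definitions.

```c
/* Hypothetical stand-ins for the quota modes; not the kernel's enum. */
enum sketch_quota_mode { SKETCH_QUOTA_OFF, SKETCH_QUOTA_ACCOUNT, SKETCH_QUOTA_ON };

/* Hypothetical flag bits; the values are illustrative only. */
#define SKETCH_UDQ_ACCT 0x01u
#define SKETCH_UDQ_ENFD 0x02u
#define SKETCH_GDQ_ACCT 0x04u
#define SKETCH_GDQ_ENFD 0x08u

/* Build the flag word reported for a given quota mode. */
static unsigned int sketch_quota_state_flags(enum sketch_quota_mode mode)
{
	unsigned int flags = 0;

	switch (mode) {
	case SKETCH_QUOTA_ON:
		flags |= SKETCH_UDQ_ENFD | SKETCH_GDQ_ENFD;
		/* fall through: enforcing implies accounting */
	case SKETCH_QUOTA_ACCOUNT:
		flags |= SKETCH_UDQ_ACCT | SKETCH_GDQ_ACCT;
		break;
	case SKETCH_QUOTA_OFF:
		break;
	}
	return flags;
}
```

With an if/else chain that assigns rather than ORs, the enforcing case would carry only the enforcement bits, which is the behaviour this patch changes.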

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
---
 fs/gfs2/quota.c |   16 ++++++++++++----
 1 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/fs/gfs2/quota.c b/fs/gfs2/quota.c
index 6dbcbad..6ca0967 100644
--- a/fs/gfs2/quota.c
+++ b/fs/gfs2/quota.c
@@ -1418,10 +1418,18 @@ static int gfs2_quota_get_xstate(struct super_block *sb,
 
 	memset(fqs, 0, sizeof(struct fs_quota_stat));
 	fqs->qs_version = FS_QSTAT_VERSION;
-	if (sdp->sd_args.ar_quota == GFS2_QUOTA_ON)
-		fqs->qs_flags = (XFS_QUOTA_UDQ_ENFD | XFS_QUOTA_GDQ_ENFD);
-	else if (sdp->sd_args.ar_quota == GFS2_QUOTA_ACCOUNT)
-		fqs->qs_flags = (XFS_QUOTA_UDQ_ACCT | XFS_QUOTA_GDQ_ACCT);
+
+	switch (sdp->sd_args.ar_quota) {
+	case GFS2_QUOTA_ON:
+		fqs->qs_flags |= (XFS_QUOTA_UDQ_ENFD | XFS_QUOTA_GDQ_ENFD);
+		/*FALLTHRU*/
+	case GFS2_QUOTA_ACCOUNT:
+		fqs->qs_flags |= (XFS_QUOTA_UDQ_ACCT | XFS_QUOTA_GDQ_ACCT);
+		break;
+	case GFS2_QUOTA_OFF:
+		break;
+	}
+
 	if (sdp->sd_quota_inode) {
 		fqs->qs_uquota.qfs_ino = GFS2_I(sdp->sd_quota_inode)->i_no_addr;
 		fqs->qs_uquota.qfs_nblks = sdp->sd_quota_inode->i_blocks;
-- 
1.6.2.5



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Cluster-devel] [PATCH 07/11] GFS2: Add some useful messages
  2010-05-17 12:40           ` [Cluster-devel] [PATCH 06/11] GFS2: fix quota state reporting Steven Whitehouse
@ 2010-05-17 12:40             ` Steven Whitehouse
  2010-05-17 12:40               ` [Cluster-devel] [PATCH 08/11] GFS2: Fix writing to non-page aligned gfs2_quota structures Steven Whitehouse
  0 siblings, 1 reply; 27+ messages in thread
From: Steven Whitehouse @ 2010-05-17 12:40 UTC (permalink / raw)
  To: cluster-devel.redhat.com

The following patch adds a message to indicate when barriers have been
disabled due to a block device which doesn't support them. You could
already tell this via the mount options in /proc/mounts, but all the
other filesystems also log a message at the same time.

Also, the same mechanisms are used to indicate when the lock
demote interface has been used (only ever used for debugging)
which is a request from our support team.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
---
 fs/gfs2/incore.h |    1 +
 fs/gfs2/log.c    |    1 +
 fs/gfs2/super.c  |    3 ++-
 fs/gfs2/sys.c    |    2 ++
 4 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index 08dd657..b5d7363 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -459,6 +459,7 @@ enum {
 	SDF_SHUTDOWN		= 2,
 	SDF_NOBARRIERS		= 3,
 	SDF_NORECOVERY		= 4,
+	SDF_DEMOTE		= 5,
 };
 
 #define GFS2_FSNAME_LEN		256
diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index d5959df..b593f0e 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -600,6 +600,7 @@ static void log_write_header(struct gfs2_sbd *sdp, u32 flags, int pull)
 	if (buffer_eopnotsupp(bh)) {
 		clear_buffer_eopnotsupp(bh);
 		set_buffer_uptodate(bh);
+		fs_info(sdp, "barrier sync failed - disabling barriers\n");
 		set_bit(SDF_NOBARRIERS, &sdp->sd_flags);
 		lock_buffer(bh);
 skip_barrier:
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index 7a93e9f..4d1aad3 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -1334,7 +1334,8 @@ static int gfs2_show_options(struct seq_file *s, struct vfsmount *mnt)
 	}
 	if (test_bit(SDF_NOBARRIERS, &sdp->sd_flags))
 		seq_printf(s, ",nobarrier");
-
+	if (test_bit(SDF_DEMOTE, &sdp->sd_flags))
+		seq_printf(s, ",demote_interface_used");
 	return 0;
 }
 
diff --git a/fs/gfs2/sys.c b/fs/gfs2/sys.c
index 2ac845d..7afb62e 100644
--- a/fs/gfs2/sys.c
+++ b/fs/gfs2/sys.c
@@ -233,6 +233,8 @@ static ssize_t demote_rq_store(struct gfs2_sbd *sdp, const char *buf, size_t len
 	glops = gfs2_glops_list[gltype];
 	if (glops == NULL)
 		return -EINVAL;
+	if (test_and_set_bit(SDF_DEMOTE, &sdp->sd_flags))
+		fs_info(sdp, "demote interface used\n");
 	rv = gfs2_glock_get(sdp, glnum, glops, 0, &gl);
 	if (rv)
 		return rv;
-- 
1.6.2.5



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Cluster-devel] [PATCH 08/11] GFS2: Fix writing to non-page aligned gfs2_quota structures
  2010-05-17 12:40             ` [Cluster-devel] [PATCH 07/11] GFS2: Add some useful messages Steven Whitehouse
@ 2010-05-17 12:40               ` Steven Whitehouse
  2010-05-17 12:40                 ` [Cluster-devel] [PATCH 09/11] GFS2: Eliminate useless err variable Steven Whitehouse
  0 siblings, 1 reply; 27+ messages in thread
From: Steven Whitehouse @ 2010-05-17 12:40 UTC (permalink / raw)
  To: cluster-devel.redhat.com

From: Abhijith Das <adas@redhat.com>

This is the upstream fix for this bug. This patch differs
from the RHEL5 fix (Red Hat bz #555754) which simply writes to the 8-byte
value field of the quota. In upstream quota code, we're
required to write the entire quota (88 bytes) which can be split
across a page boundary. We check for such quotas, and read/write
the two parts from/to the corresponding pages holding these parts.
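
The split write described above can be sketched in plain userspace C. This is only an illustration under stated assumptions: a 4096-byte page, an in-memory array standing in for the page cache, and hypothetical names throughout. It loops over the remaining bytes, which is a different formulation from the goto-based retry used in the patch.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE  4096u	/* assumed page size */
#define SKETCH_QUOTA_SIZE 88u	/* record size quoted in the text above */

/*
 * Copy len bytes from src into a paged image starting at byte position
 * pos, never writing past the end of a page in a single memcpy.
 * Returns the number of pages that were written to.
 */
static unsigned int sketch_write_across_pages(
	unsigned char pages[][SKETCH_PAGE_SIZE],
	size_t pos, const void *src, size_t len)
{
	const unsigned char *ptr = src;
	size_t index = pos / SKETCH_PAGE_SIZE;
	size_t offset = pos % SKETCH_PAGE_SIZE;
	unsigned int touched = 0;

	while (len > 0) {
		size_t nbytes = len;

		/* Clamp the copy to what is left of the current page. */
		if (offset + nbytes > SKETCH_PAGE_SIZE)
			nbytes = SKETCH_PAGE_SIZE - offset;
		memcpy(&pages[index][offset], ptr, nbytes);

		/* The remainder, if any, starts at the top of the next page. */
		ptr += nbytes;
		len -= nbytes;
		offset = 0;
		index++;
		touched++;
	}
	return touched;
}
```

A record that starts 40 bytes before the end of a page is written as 40 bytes on that page and 48 bytes at the start of the next one; a record that fits entirely within a page touches only one.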

With this patch, I don't see the bug anymore using the reproducer
in Red Hat bz 555754. I successfully ran a couple of simple tests/mounts/
umounts and it doesn't seem like this patch breaks anything else.

Signed-off-by: Abhi Das <adas@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
---
 fs/gfs2/quota.c |   86 +++++++++++++++++++++++++++++++++++++++----------------
 1 files changed, 61 insertions(+), 25 deletions(-)

diff --git a/fs/gfs2/quota.c b/fs/gfs2/quota.c
index 6ca0967..d5f4661 100644
--- a/fs/gfs2/quota.c
+++ b/fs/gfs2/quota.c
@@ -637,15 +637,40 @@ static int gfs2_adjust_quota(struct gfs2_inode *ip, loff_t loc,
 	unsigned blocksize, iblock, pos;
 	struct buffer_head *bh, *dibh;
 	struct page *page;
-	void *kaddr;
-	struct gfs2_quota *qp;
-	s64 value;
-	int err = -EIO;
+	void *kaddr, *ptr;
+	struct gfs2_quota q, *qp;
+	int err, nbytes;
 	u64 size;
 
 	if (gfs2_is_stuffed(ip))
 		gfs2_unstuff_dinode(ip, NULL);
-	
+
+	memset(&q, 0, sizeof(struct gfs2_quota));
+	err = gfs2_internal_read(ip, NULL, (char *)&q, &loc, sizeof(q));
+	if (err < 0)
+		return err;
+
+	err = -EIO;
+	qp = &q;
+	qp->qu_value = be64_to_cpu(qp->qu_value);
+	qp->qu_value += change;
+	qp->qu_value = cpu_to_be64(qp->qu_value);
+	qd->qd_qb.qb_value = qp->qu_value;
+	if (fdq) {
+		if (fdq->d_fieldmask & FS_DQ_BSOFT) {
+			qp->qu_warn = cpu_to_be64(fdq->d_blk_softlimit);
+			qd->qd_qb.qb_warn = qp->qu_warn;
+		}
+		if (fdq->d_fieldmask & FS_DQ_BHARD) {
+			qp->qu_limit = cpu_to_be64(fdq->d_blk_hardlimit);
+			qd->qd_qb.qb_limit = qp->qu_limit;
+		}
+	}
+
+	/* Write the quota into the quota file on disk */
+	ptr = qp;
+	nbytes = sizeof(struct gfs2_quota);
+get_a_page:
 	page = grab_cache_page(mapping, index);
 	if (!page)
 		return -ENOMEM;
@@ -667,7 +692,12 @@ static int gfs2_adjust_quota(struct gfs2_inode *ip, loff_t loc,
 	if (!buffer_mapped(bh)) {
 		gfs2_block_map(inode, iblock, bh, 1);
 		if (!buffer_mapped(bh))
-			goto unlock;
+			goto unlock_out;
+		/* If it's a newly allocated disk block for quota, zero it */
+		if (buffer_new(bh)) {
+			memset(bh->b_data, 0, bh->b_size);
+			set_buffer_uptodate(bh);
+		}
 	}
 
 	if (PageUptodate(page))
@@ -677,32 +707,34 @@ static int gfs2_adjust_quota(struct gfs2_inode *ip, loff_t loc,
 		ll_rw_block(READ_META, 1, &bh);
 		wait_on_buffer(bh);
 		if (!buffer_uptodate(bh))
-			goto unlock;
+			goto unlock_out;
 	}
 
 	gfs2_trans_add_bh(ip->i_gl, bh, 0);
 
 	kaddr = kmap_atomic(page, KM_USER0);
-	qp = kaddr + offset;
-	value = (s64)be64_to_cpu(qp->qu_value) + change;
-	qp->qu_value = cpu_to_be64(value);
-	qd->qd_qb.qb_value = qp->qu_value;
-	if (fdq) {
-		if (fdq->d_fieldmask & FS_DQ_BSOFT) {
-			qp->qu_warn = cpu_to_be64(fdq->d_blk_softlimit);
-			qd->qd_qb.qb_warn = qp->qu_warn;
-		}
-		if (fdq->d_fieldmask & FS_DQ_BHARD) {
-			qp->qu_limit = cpu_to_be64(fdq->d_blk_hardlimit);
-			qd->qd_qb.qb_limit = qp->qu_limit;
-		}
-	}
+	if (offset + sizeof(struct gfs2_quota) > PAGE_CACHE_SIZE)
+		nbytes = PAGE_CACHE_SIZE - offset;
+	memcpy(kaddr + offset, ptr, nbytes);
 	flush_dcache_page(page);
 	kunmap_atomic(kaddr, KM_USER0);
+	unlock_page(page);
+	page_cache_release(page);
+
+	/* If quota straddles page boundary, we need to update the rest of the
+	 * quota at the beginning of the next page */
+	if (offset != 0) { /* first page, offset is closer to PAGE_CACHE_SIZE */
+		ptr = ptr + nbytes;
+		nbytes = sizeof(struct gfs2_quota) - nbytes;
+		offset = 0;
+		index++;
+		goto get_a_page;
+	}
 
+	/* Update the disk inode timestamp and size (if extended) */
 	err = gfs2_meta_inode_buffer(ip, &dibh);
 	if (err)
-		goto unlock;
+		goto out;
 
 	size = loc + sizeof(struct gfs2_quota);
 	if (size > inode->i_size) {
@@ -715,7 +747,9 @@ static int gfs2_adjust_quota(struct gfs2_inode *ip, loff_t loc,
 	brelse(dibh);
 	mark_inode_dirty(inode);
 
-unlock:
+out:
+	return err;
+unlock_out:
 	unlock_page(page);
 	page_cache_release(page);
 	return err;
@@ -779,8 +813,10 @@ static int do_sync(unsigned int num_qd, struct gfs2_quota_data **qda)
 	 * rgrp since it won't be allocated during the transaction
 	 */
 	al->al_requested = 1;
-	/* +1 in the end for block requested above for unstuffing */
-	blocks = num_qd * data_blocks + RES_DINODE + num_qd + 1;
+	/* +3 in the end for unstuffing block, inode size update block
+	 * and another block in case quota straddles page boundary and 
+	 * two blocks need to be updated instead of 1 */
+	blocks = num_qd * data_blocks + RES_DINODE + num_qd + 3;
 
 	if (nalloc)
 		al->al_requested += nalloc * (data_blocks + ind_blocks);		
-- 
1.6.2.5



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Cluster-devel] [PATCH 09/11] GFS2: Eliminate useless err variable
  2010-05-17 12:40               ` [Cluster-devel] [PATCH 08/11] GFS2: Fix writing to non-page aligned gfs2_quota structures Steven Whitehouse
@ 2010-05-17 12:40                 ` Steven Whitehouse
  2010-05-17 12:40                   ` [Cluster-devel] [PATCH 10/11] GFS2: stuck in inode wait, no glocks stuck Steven Whitehouse
  0 siblings, 1 reply; 27+ messages in thread
From: Steven Whitehouse @ 2010-05-17 12:40 UTC (permalink / raw)
  To: cluster-devel.redhat.com

From: Bob Peterson <rpeterso@redhat.com>

This patch removes an unneeded "err" variable that is always
returned as zero.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
---
 fs/gfs2/meta_io.c |    4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/fs/gfs2/meta_io.c b/fs/gfs2/meta_io.c
index abafda1..18176d0 100644
--- a/fs/gfs2/meta_io.c
+++ b/fs/gfs2/meta_io.c
@@ -34,7 +34,6 @@
 
 static int gfs2_aspace_writepage(struct page *page, struct writeback_control *wbc)
 {
-	int err;
 	struct buffer_head *bh, *head;
 	int nr_underway = 0;
 	int write_op = (1 << BIO_RW_META) | ((wbc->sync_mode == WB_SYNC_ALL ?
@@ -86,11 +85,10 @@ static int gfs2_aspace_writepage(struct page *page, struct writeback_control *wb
 	} while (bh != head);
 	unlock_page(page);
 
-	err = 0;
 	if (nr_underway == 0)
 		end_page_writeback(page);
 
-	return err;
+	return 0;
 }
 
 const struct address_space_operations gfs2_meta_aops = {
-- 
1.6.2.5



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Cluster-devel] [PATCH 10/11] GFS2: stuck in inode wait, no glocks stuck
  2010-05-17 12:40                 ` [Cluster-devel] [PATCH 09/11] GFS2: Eliminate useless err variable Steven Whitehouse
@ 2010-05-17 12:40                   ` Steven Whitehouse
  2010-05-17 12:40                     ` [Cluster-devel] [PATCH 11/11] GFS2: Fix typo Steven Whitehouse
  0 siblings, 1 reply; 27+ messages in thread
From: Steven Whitehouse @ 2010-05-17 12:40 UTC (permalink / raw)
  To: cluster-devel.redhat.com

From: Bob Peterson <rpeterso@redhat.com>

This patch changes the lock ordering when gfs2 reclaims
unlinked dinodes, thereby avoiding a livelock.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
---
 fs/gfs2/rgrp.c |   78 +++++++++++++++++++++----------------------------------
 1 files changed, 30 insertions(+), 48 deletions(-)

diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c
index 3739155..8bce73e 100644
--- a/fs/gfs2/rgrp.c
+++ b/fs/gfs2/rgrp.c
@@ -952,16 +952,14 @@ static int try_rgrp_fit(struct gfs2_rgrpd *rgd, struct gfs2_alloc *al)
  *          The inode, if one has been found, in inode.
  */
 
-static int try_rgrp_unlink(struct gfs2_rgrpd *rgd, u64 *last_unlinked,
-			   u64 skip, struct inode **inode)
+static u64 try_rgrp_unlink(struct gfs2_rgrpd *rgd, u64 *last_unlinked,
+			   u64 skip)
 {
 	u32 goal = 0, block;
 	u64 no_addr;
 	struct gfs2_sbd *sdp = rgd->rd_sbd;
 	unsigned int n;
-	int error = 0;
 
-	*inode = NULL;
 	for(;;) {
 		if (goal >= rgd->rd_data)
 			break;
@@ -981,10 +979,7 @@ static int try_rgrp_unlink(struct gfs2_rgrpd *rgd, u64 *last_unlinked,
 		if (no_addr == skip)
 			continue;
 		*last_unlinked = no_addr;
-		error = gfs2_unlinked_inode_lookup(rgd->rd_sbd->sd_vfs,
-						   no_addr, inode);
-		if (*inode || error)
-			return error;
+		return no_addr;
 	}
 
 	rgd->rd_flags &= ~GFS2_RDF_CHECK;
@@ -1069,11 +1064,12 @@ static void forward_rgrp_set(struct gfs2_sbd *sdp, struct gfs2_rgrpd *rgd)
  * Try to acquire rgrp in way which avoids contending with others.
  *
  * Returns: errno
+ *          unlinked: the block address of an unlinked block to be reclaimed
  */
 
-static struct inode *get_local_rgrp(struct gfs2_inode *ip, u64 *last_unlinked)
+static int get_local_rgrp(struct gfs2_inode *ip, u64 *unlinked,
+			  u64 *last_unlinked)
 {
-	struct inode *inode = NULL;
 	struct gfs2_sbd *sdp = GFS2_SB(&ip->i_inode);
 	struct gfs2_rgrpd *rgd, *begin = NULL;
 	struct gfs2_alloc *al = ip->i_alloc;
@@ -1082,6 +1078,7 @@ static struct inode *get_local_rgrp(struct gfs2_inode *ip, u64 *last_unlinked)
 	int loops = 0;
 	int error, rg_locked;
 
+	*unlinked = 0;
 	rgd = gfs2_blk2rgrpd(sdp, ip->i_goal);
 
 	while (rgd) {
@@ -1103,29 +1100,19 @@ static struct inode *get_local_rgrp(struct gfs2_inode *ip, u64 *last_unlinked)
 			   because that would require an iput which can only
 			   happen after the rgrp is unlocked. */
 			if (!rg_locked && rgd->rd_flags & GFS2_RDF_CHECK)
-				error = try_rgrp_unlink(rgd, last_unlinked,
-							ip->i_no_addr, &inode);
+				*unlinked = try_rgrp_unlink(rgd, last_unlinked,
+							   ip->i_no_addr);
 			if (!rg_locked)
 				gfs2_glock_dq_uninit(&al->al_rgd_gh);
-			if (inode) {
-				if (error) {
-					if (inode->i_state & I_NEW)
-						iget_failed(inode);
-					else
-						iput(inode);
-					return ERR_PTR(error);
-				}
-				return inode;
-			}
-			if (error)
-				return ERR_PTR(error);
+			if (*unlinked)
+				return -EAGAIN;
 			/* fall through */
 		case GLR_TRYFAILED:
 			rgd = recent_rgrp_next(rgd);
 			break;
 
 		default:
-			return ERR_PTR(error);
+			return error;
 		}
 	}
 
@@ -1148,22 +1135,12 @@ static struct inode *get_local_rgrp(struct gfs2_inode *ip, u64 *last_unlinked)
 			if (try_rgrp_fit(rgd, al))
 				goto out;
 			if (!rg_locked && rgd->rd_flags & GFS2_RDF_CHECK)
-				error = try_rgrp_unlink(rgd, last_unlinked,
-							ip->i_no_addr, &inode);
+				*unlinked = try_rgrp_unlink(rgd, last_unlinked,
+							    ip->i_no_addr);
 			if (!rg_locked)
 				gfs2_glock_dq_uninit(&al->al_rgd_gh);
-			if (inode) {
-				if (error) {
-					if (inode->i_state & I_NEW)
-						iget_failed(inode);
-					else
-						iput(inode);
-					return ERR_PTR(error);
-				}
-				return inode;
-			}
-			if (error)
-				return ERR_PTR(error);
+			if (*unlinked)
+				return -EAGAIN;
 			break;
 
 		case GLR_TRYFAILED:
@@ -1171,7 +1148,7 @@ static struct inode *get_local_rgrp(struct gfs2_inode *ip, u64 *last_unlinked)
 			break;
 
 		default:
-			return ERR_PTR(error);
+			return error;
 		}
 
 		rgd = gfs2_rgrpd_get_next(rgd);
@@ -1180,7 +1157,7 @@ static struct inode *get_local_rgrp(struct gfs2_inode *ip, u64 *last_unlinked)
 
 		if (rgd == begin) {
 			if (++loops >= 3)
-				return ERR_PTR(-ENOSPC);
+				return -ENOSPC;
 			if (!skipped)
 				loops++;
 			flags = 0;
@@ -1200,7 +1177,7 @@ out:
 		forward_rgrp_set(sdp, rgd);
 	}
 
-	return NULL;
+	return 0;
 }
 
 /**
@@ -1216,7 +1193,7 @@ int gfs2_inplace_reserve_i(struct gfs2_inode *ip, char *file, unsigned int line)
 	struct gfs2_alloc *al = ip->i_alloc;
 	struct inode *inode;
 	int error = 0;
-	u64 last_unlinked = NO_BLOCK;
+	u64 last_unlinked = NO_BLOCK, unlinked;
 
 	if (gfs2_assert_warn(sdp, al->al_requested))
 		return -EINVAL;
@@ -1232,14 +1209,19 @@ try_again:
 	if (error)
 		return error;
 
-	inode = get_local_rgrp(ip, &last_unlinked);
-	if (inode) {
+	error = get_local_rgrp(ip, &unlinked, &last_unlinked);
+	if (error) {
 		if (ip != GFS2_I(sdp->sd_rindex))
 			gfs2_glock_dq_uninit(&al->al_ri_gh);
-		if (IS_ERR(inode))
-			return PTR_ERR(inode);
-		iput(inode);
+		if (error != -EAGAIN)
+			return error;
+		error = gfs2_unlinked_inode_lookup(ip->i_inode.i_sb,
+						   unlinked, &inode);
+		if (inode)
+			iput(inode);
 		gfs2_log_flush(sdp, NULL);
+		if (error == GLR_TRYFAILED)
+			error = 0;
 		goto try_again;
 	}
 
-- 
1.6.2.5



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Cluster-devel] [PATCH 11/11] GFS2: Fix typo
  2010-05-17 12:40                   ` [Cluster-devel] [PATCH 10/11] GFS2: stuck in inode wait, no glocks stuck Steven Whitehouse
@ 2010-05-17 12:40                     ` Steven Whitehouse
  0 siblings, 0 replies; 27+ messages in thread
From: Steven Whitehouse @ 2010-05-17 12:40 UTC (permalink / raw)
  To: cluster-devel.redhat.com

A missing ! in a test.
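
For context, the kernel's test_and_set_bit() returns the previous value of the bit, so a "log once" check has to fire when the result is zero. The sketch below mimics that behaviour in userspace with C11 atomics; the helper names are hypothetical and it is not the kernel implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Userspace stand-in for test_and_set_bit(): atomically set bit nr in
 * *flags and report whether it was already set beforehand.
 */
static bool sketch_test_and_set_bit(unsigned int nr, atomic_ulong *flags)
{
	unsigned long mask = 1UL << nr;

	return (atomic_fetch_or(flags, mask) & mask) != 0;
}

/*
 * Returns true only for the first caller that sets the bit, which is
 * the moment a one-time message should be logged. Without the negation
 * the first use would be silent and every later use would log.
 */
static bool sketch_should_log_first_use(atomic_ulong *flags, unsigned int bit)
{
	return !sketch_test_and_set_bit(bit, flags);
}
```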

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
---
 fs/gfs2/sys.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/gfs2/sys.c b/fs/gfs2/sys.c
index 7afb62e..68d2795 100644
--- a/fs/gfs2/sys.c
+++ b/fs/gfs2/sys.c
@@ -233,7 +233,7 @@ static ssize_t demote_rq_store(struct gfs2_sbd *sdp, const char *buf, size_t len
 	glops = gfs2_glops_list[gltype];
 	if (glops == NULL)
 		return -EINVAL;
-	if (test_and_set_bit(SDF_DEMOTE, &sdp->sd_flags))
+	if (!test_and_set_bit(SDF_DEMOTE, &sdp->sd_flags))
 		fs_info(sdp, "demote interface used\n");
 	rv = gfs2_glock_get(sdp, glnum, glops, 0, &gl);
 	if (rv)
-- 
1.6.2.5



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [Cluster-devel] GFS2: Pre-pull patch posting
@ 2015-02-10 10:36 Steven Whitehouse
  0 siblings, 0 replies; 27+ messages in thread
From: Steven Whitehouse @ 2015-02-10 10:36 UTC (permalink / raw)
  To: cluster-devel.redhat.com

This time we have mostly clean ups. There is a bug fix for a NULL dereference
relating to ACLs, and another which improves (but does not fix entirely) an
allocation fall-back code path. The other three patches are small clean ups.

Steve.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [Cluster-devel] GFS2: Pre-pull patch posting
@ 2013-04-05  9:57 Steven Whitehouse
  0 siblings, 0 replies; 27+ messages in thread
From: Steven Whitehouse @ 2013-04-05  9:57 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

Here are a few GFS2 fixes which are pending. There are two patches
which fix up a couple of minor issues in the DLM interface code,
a missing error path in gfs2_rs_alloc(), two patches which fix problems
during "withdraw" and a fix for discards/FITRIM when using 4k sector
sized devices,

Steve.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [Cluster-devel] GFS2: Pre-pull patch posting
@ 2013-01-03 11:50 Steven Whitehouse
  0 siblings, 0 replies; 27+ messages in thread
From: Steven Whitehouse @ 2013-01-03 11:50 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

Here are four small bug fixes for GFS2. There is no common theme here
really, just a few items that were fixed recently. The first fixes
lock name generation when the glock number is 0. The second fixes a
race allocating reservation structures and the final two fix a performance
issue by making small changes in the allocation code,

Steve.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [Cluster-devel] GFS2: Pre-pull patch posting
@ 2010-10-18 14:15 Steven Whitehouse
  0 siblings, 0 replies; 27+ messages in thread
From: Steven Whitehouse @ 2010-10-18 14:15 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

I know the merge window isn't open yet, but at this stage I'm going to
hold off on any larger patches until the following merge window so this
patch set isn't likely to change much, hence kicking it out a bit early
for review.

There are a few interesting points to note in this patch set:
 o GFS2 is updated to use the new truncate sequence
 o Support for fallocate is added
 o Clean up of some unused/obsolete mount options

I'm currently working on a patch to allow the glock hash table
to use RCU. That is currently a work-in-progress and that will
hopefully be ready for the succeeding merge window.

Steve.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [Cluster-devel] GFS2: Pre-pull patch posting
@ 2010-08-02  9:27 Steven Whitehouse
  0 siblings, 0 replies; 27+ messages in thread
From: Steven Whitehouse @ 2010-08-02  9:27 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

Here is the current content of the GFS2 -nmw git tree. Mostly it's
just clean up and bug fixes this time. There is one exception which
is the "wait for journal id" patch which is a new feature aimed
at (eventually) allowing us to simplify the userland support which
GFS2 requires,

Steve.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [Cluster-devel] GFS2: Pre-pull patch posting
@ 2010-03-11 17:21 Steven Whitehouse
  0 siblings, 0 replies; 27+ messages in thread
From: Steven Whitehouse @ 2010-03-11 17:21 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Here are three small (but important!) fixes to GFS2.

Steve.



^ permalink raw reply	[flat|nested] 27+ messages in thread

* [Cluster-devel] GFS2: Pre-pull patch posting
@ 2010-03-01 15:08 Steven Whitehouse
  0 siblings, 0 replies; 27+ messages in thread
From: Steven Whitehouse @ 2010-03-01 15:08 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

Not so many patches for GFS2 this merge window. The bulk of the changes are
aimed at reducing overheads when caching large numbers of inodes and
the consequent simplification of the umount code,

Steve.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [Cluster-devel] GFS2: Pre-pull patch posting
@ 2009-09-10 11:27 Steven Whitehouse
  0 siblings, 0 replies; 27+ messages in thread
From: Steven Whitehouse @ 2009-09-10 11:27 UTC (permalink / raw)
  To: cluster-devel.redhat.com

As merge time is approaching, here is the current content of the
GFS2 -nmw git tree. I'm not expecting to take any more patches
now for the current merge window unless any last minute bugs
are discovered.

There is not a huge amount new this time. Some extra context for
uevent messages, better error handling during block allocation,
and a clean up of extended attribute support. There is still more
to do on the extended attribute side of things, but this is a good
start I think.

There are a few bug fixes as well. Once these patches are merged
I'm intending to start off the next -nmw tree with a patch to
remove some of the (now unused) sysfs files as per the message
on cluster-devel a few weeks back.

Steve.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [Cluster-devel] GFS2: Pre-pull patch posting
@ 2009-06-10  8:30 Steven Whitehouse
  0 siblings, 0 replies; 27+ messages in thread
From: Steven Whitehouse @ 2009-06-10  8:30 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

As the merge window is more or less upon us, here is the content
of the GFS2 -nmw tree. There is nothing too startling this time
as the focus has very much been bug fixes and clean up.

We have a new mount option, commit= which does exactly the same
thing as the ext3 equivalent option. We have a long term plan to
make all the tunable parameters available as mount options and
thus to be able to eventually drop the sysfs interface to these
parameters.

Another long term plan is to get rid of the files named
ops_somethingorother and to either merge them into other
files, or rename them to not have the ops_ prefix. This
patch series makes a start on that, and does all the easy
ones. As a result some functions with only one caller are
moved to the same file as their caller and made static.

The docs are also updated to reflect the fact that the
lock_dlm interface module no longer exists and that interface is
now built into GFS2.

Steve.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [Cluster-devel] [GFS2] Pre-pull patch posting
@ 2009-03-18 12:23 swhiteho
  0 siblings, 0 replies; 27+ messages in thread
From: swhiteho @ 2009-03-18 12:23 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

So as the merge window draws closer, here is the current content of the
GFS2 git tree. The major item this time is patch 5 in the series. This
contains by far the majority of the changes, and the majority of those
changes are actually removal of code. The patch merges the lock_dlm
module (not the dlm itself, but GFS2's interface to the dlm) into
GFS2 itself. This means that a number of optimisations are then
possible in terms of merging structures resulting in a considerable
saving in memory.

Since that patch is so large (I'm afraid that it really doesn't make
any sense to split it up) it's been in the -nmw git tree for the whole
period since the last merge window and has also been posted for review
on cluster-devel on a number of occasions before that. We've run a
number of tests on it as well in that period, so I believe that it's
pretty stable now. It certainly makes the code a lot cleaner and
easier to follow in that area.

The remainder of the patches are mostly bug fixes, but there are one
or two other interesting features, those being:

 o GFS2 now supports the discard I/O requests for thin provisioning, etc
 o A new "demote a glock" interface is added to sysfs to help in
   testing GFS2
 o With a new mkfs.gfs2 which writes UUIDs, the UUID is now included
   in uevent messages (with older filesystems which don't have UUIDs,
   we just don't send that information)

The GFS2 tracing patches which I posted a little while back are not
included in this patch set. I think I can see what I need to do in
order to avoid patching blktrace now, so my plan is to look at those
patches again after this merge window, and when all the queued patches
for the tracing subsystem have been merged.

As always, please let us know if you spot any issues in the patches,

Steve.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [Cluster-devel] GFS2: Pre-pull patch posting
@ 2008-12-17 11:29 swhiteho
  0 siblings, 0 replies; 27+ messages in thread
From: swhiteho @ 2008-12-17 11:29 UTC (permalink / raw)
  To: cluster-devel.redhat.com

In preparation for the next merge window, here is the current content
of the GFS2 git tree. Firstly, we have one new feature, which is
support for the FIEMAP ioctl. That patch does touch some code
outside of GFS2 itself, but it's the only patch in this series which
does so.

The remaining patches are mostly clean up and bug fixes, as usual.
They are working towards a point where I can submit a patch to finally
merge the lock_dlm module (not dlm itself I should emphasise) into
GFS2. That is a large patch and a preliminary version has already been
posted to cluster-devel. My plan is to put that patch into my -nmw
git tree as the first patch for the following merge window to give
it maximum exposure.

The other highlight of this patch series, is a patch which removes
the two daemons (gfs2_scand and gfs2_glockd) and replaces them with
a "shrinker" routine registered with the VM. As expected, this also
reduces the code size. We are also expecting to do a similar thing
with the GFS2 quota data structures at some point in the future.

Steve.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [Cluster-devel] GFS2: Pre-pull patch posting
@ 2008-09-26 12:00 Steven Whitehouse
  0 siblings, 0 replies; 27+ messages in thread
From: Steven Whitehouse @ 2008-09-26 12:00 UTC (permalink / raw)
  To: cluster-devel.redhat.com

I'm guessing that the merge window opening might not be too far
away now, and in any case, I won't have quite my normal internet
access next week. So I'm pushing out the current GFS2 tree so
that (I hope) there will be time to fix any issues.

Again, there are fewer patches here. A lot of them are fairly small
too. The most notable item deals with the meta filesystem which
was in response to Al Viro's suggestions concerning a better way
to structure that code. It certainly results in a much cleaner
implementation, so thanks go to Al for pointing that out.

A couple of new features: I/O barrier support (needs no user
configuration, see the patch for details) and UUID support
(no code changes, it's all userland but we reserve space in
the super block, again details in the patch itself).

I know that I hardly need say, but please let me know if you have any
comments :-)

Steve.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [Cluster-devel] [GFS2] Pre-pull patch posting
@ 2008-07-11 10:11 swhiteho
  0 siblings, 0 replies; 27+ messages in thread
From: swhiteho @ 2008-07-11 10:11 UTC (permalink / raw)
  To: cluster-devel.redhat.com

So, although the merge window isn't yet open, I'm guessing that it's
probably not too far away, hence this posting of the contents of
the current GFS2 -nmw git tree.

This time the big news is locking changes, although having said that,
there are far fewer queued patches in total than I've had for previous
merge windows and I believe that this is an indication of the
growing maturity of GFS2.

The first patch in the series is really the main change and is
a big clean up of the core of the glocks, which are really the
core of GFS2 in a lot of ways. Further through the series is
a documentation patch, which explains the fine detail of how
glocks work and the assumptions made by the glock core when
calling the functions relating to individual glock types.

Other notable changes include merging the lock_nolock module
into the core of GFS2 since there is little point in retaining
it separately. There is a plan to do the same to lock_dlm as
well in the future (not the DLM itself obviously, just the
interface module that's part of GFS2).

Most of the remaining changes are bug fixes or further optimisations
over the initial glock changes, plus one or two minor clean ups
along the way.

Steve.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [Cluster-devel] [GFS2] Pre-pull patch posting
@ 2008-04-17  8:37 swhiteho
  0 siblings, 0 replies; 27+ messages in thread
From: swhiteho @ 2008-04-17  8:37 UTC (permalink / raw)
  To: cluster-devel.redhat.com

This is the current content of the GFS2 -nmw git tree. Mostly bug
fixes, there are some changes relating to block mapping which are
working towards cleaning up this code and allowing more efficient
block mapping. There is a second part to that work which is not
included in this patch set - the plan is that it will be in the
next patch set and it's currently undergoing testing.

There are a number of clean up patches in the series too. We have
been continuing the work of gradually reducing the fields in the
various ..._host structures with a view to eventually eliminating
them completely. They were introduced as a stop-gap measure to fix
the endianness annotation and the fields are now gradually being moved
to other structures (or eliminated).

Bob Peterson's new improved "bitfit" algorithm provides a nice
speed up when we are allocating blocks as well as cleaning up
that area of the code.
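
As a rough illustration of the kind of search involved (this is not
the code from the patch), GFS2 allocation bitmaps hold two bits of
state per block, so a search can test many blocks per step by
comparing a whole word against the wanted state repeated across it.
The sketch below assumes a hypothetical in-memory layout where block
i lives in 64-bit word i/32 at bit offset 2*(i%32); the names
bitfit_sketch() and NO_BLOCK are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_ENTRY   2   /* two state bits per block */
#define ENTRIES_PER_WORD 32  /* blocks described by one 64-bit word */
#define NO_BLOCK ((size_t)-1)

/*
 * Return the first block at or after 'goal' whose 2-bit state equals
 * 'state', or NO_BLOCK if there is none. 'bitmap' must hold at least
 * (nblocks + 31) / 32 words. The word layout is an assumption of this
 * sketch, not the on-disk format.
 */
static size_t bitfit_sketch(const uint64_t *bitmap, size_t nblocks,
			    size_t goal, unsigned state)
{
	const uint64_t low_bits = 0x5555555555555555ULL;
	/* The wanted state repeated in all 32 entries of a word */
	const uint64_t pattern = low_bits * (state & 3);
	size_t blk = goal;

	assert(bitmap != NULL);

	while (blk < nblocks) {
		size_t word = blk / ENTRIES_PER_WORD;
		unsigned shift = (blk % ENTRIES_PER_WORD) * BITS_PER_ENTRY;
		/* An entry matches when both of its bits are zero in diff */
		uint64_t diff = bitmap[word] ^ pattern;
		uint64_t match = ~(diff | (diff >> 1)) & low_bits;
		unsigned bit = 0;

		/* Ignore entries that lie before 'blk' in this word */
		match &= ~0ULL << shift;
		if (match == 0) {
			/* Nothing here: skip 32 blocks in one step */
			blk = (word + 1) * ENTRIES_PER_WORD;
			continue;
		}
		/* Locate the lowest matching entry in the word */
		while (!(match & 1)) {
			match >>= 1;
			bit++;
		}
		blk = word * ENTRIES_PER_WORD + bit / BITS_PER_ENTRY;
		return blk < nblocks ? blk : NO_BLOCK;
	}
	return NO_BLOCK;
}
```

The point of the word-at-a-time test is that a full word with no
matching entry costs one XOR and a couple of masks rather than 32
separate two-bit comparisons.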

Steve.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [Cluster-devel] [GFS2] Pre-pull patch posting
@ 2008-01-21  9:21 swhiteho
  0 siblings, 0 replies; 27+ messages in thread
From: swhiteho @ 2008-01-21  9:21 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

Here is the current GFS2 patch queue. You'll notice that this time
there are no DLM patches in this list. That is because the DLM team
are setting up their own git tree and thus future DLM patches will
be sent directly by them rather than via the GFS2 tree.

Most of this set of patches is clean up and bug fixes, there is really
not a lot new this time. I guess the most significant thing is the
patch to use ->page_mkwrite which will greatly increase efficiency
when files opened r/w are mostly only accessed for reading across a
cluster.

There are a number of cleanups related to journalling which is really
where the largest number of changes in terms of code lines is. The
indirect blocks for the journal are now scanned once only at mount
time and the bmap information is retained in the form of an extent
list. Since we expect journals to consist of only a single extent
in the normal case, this should generally be quite a short list :-)
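
To make the extent list idea concrete, here is a small userland
sketch (not the kernel code): it collapses a per-block map into
extents and then translates a logical journal block through them.
The struct jextent type and both function names are hypothetical and
chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical extent record: 'blocks' logical blocks starting at
 * 'lblock' map to consecutive disk blocks starting at 'dblock'.
 */
struct jextent {
	uint64_t lblock;
	uint64_t dblock;
	uint64_t blocks;
};

/*
 * Build an extent list from a per-block map, where map[i] is the
 * disk block backing logical block i. Returns the number of extents
 * written to 'ext', or -1 if more than 'max' would be needed.
 */
static int build_extents(const uint64_t *map, size_t nblocks,
			 struct jextent *ext, size_t max)
{
	size_t n = 0;

	assert(map != NULL && ext != NULL);

	for (size_t i = 0; i < nblocks; i++) {
		/* Physically contiguous with the last extent: extend it */
		if (n > 0 && ext[n - 1].dblock + ext[n - 1].blocks == map[i]) {
			ext[n - 1].blocks++;
			continue;
		}
		if (n == max)
			return -1;
		ext[n].lblock = i;
		ext[n].dblock = map[i];
		ext[n].blocks = 1;
		n++;
	}
	return (int)n;
}

/*
 * Translate a logical block to a disk block using the extent list.
 * Returns 0 on success, -1 if the block is not covered.
 */
static int extent_lookup(const struct jextent *ext, size_t n,
			 uint64_t lblock, uint64_t *dblock)
{
	for (size_t i = 0; i < n; i++) {
		if (lblock >= ext[i].lblock &&
		    lblock - ext[i].lblock < ext[i].blocks) {
			*dblock = ext[i].dblock + (lblock - ext[i].lblock);
			return 0;
		}
	}
	return -1;
}
```

For a journal laid out contiguously, build_extents() produces a
single entry, so each lookup is one range check rather than a walk
through the indirect blocks.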

In addition some of the tunables relating to the journal have been
removed in favour of autotuning those variables.

Steve.

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2015-02-10 10:36 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-05-17 12:40 [Cluster-devel] GFS2: Pre-pull patch posting Steven Whitehouse
2010-05-17 12:40 ` [Cluster-devel] [PATCH 01/11] GFS2: Remove space from slab cache name Steven Whitehouse
2010-05-17 12:40   ` [Cluster-devel] [PATCH 02/11] GFS2: docs update Steven Whitehouse
2010-05-17 12:40     ` [Cluster-devel] [PATCH 03/11] GFS2: Clean up stuffed file copying Steven Whitehouse
2010-05-17 12:40       ` [Cluster-devel] [PATCH 04/11] GFS2: glock livelock Steven Whitehouse
2010-05-17 12:40         ` [Cluster-devel] [PATCH 05/11] GFS2: Various gfs2_logd improvements Steven Whitehouse
2010-05-17 12:40           ` [Cluster-devel] [PATCH 06/11] GFS2: fix quota state reporting Steven Whitehouse
2010-05-17 12:40             ` [Cluster-devel] [PATCH 07/11] GFS2: Add some useful messages Steven Whitehouse
2010-05-17 12:40               ` [Cluster-devel] [PATCH 08/11] GFS2: Fix writing to non-page aligned gfs2_quota structures Steven Whitehouse
2010-05-17 12:40                 ` [Cluster-devel] [PATCH 09/11] GFS2: Eliminate useless err variable Steven Whitehouse
2010-05-17 12:40                   ` [Cluster-devel] [PATCH 10/11] GFS2: stuck in inode wait, no glocks stuck Steven Whitehouse
2010-05-17 12:40                     ` [Cluster-devel] [PATCH 11/11] GFS2: Fix typo Steven Whitehouse
  -- strict thread matches above, loose matches on Subject: below --
2015-02-10 10:36 [Cluster-devel] GFS2: Pre-pull patch posting Steven Whitehouse
2013-04-05  9:57 Steven Whitehouse
2013-01-03 11:50 Steven Whitehouse
2010-10-18 14:15 Steven Whitehouse
2010-08-02  9:27 Steven Whitehouse
2010-03-11 17:21 Steven Whitehouse
2010-03-01 15:08 Steven Whitehouse
2009-09-10 11:27 Steven Whitehouse
2009-06-10  8:30 Steven Whitehouse
2009-03-18 12:23 [Cluster-devel] [GFS2] " swhiteho
2008-12-17 11:29 [Cluster-devel] GFS2: " swhiteho
2008-09-26 12:00 Steven Whitehouse
2008-07-11 10:11 [Cluster-devel] [GFS2] " swhiteho
2008-04-17  8:37 swhiteho
2008-01-21  9:21 swhiteho
