* [Cluster-devel] [GFS2] Pre-pull patch posting
@ 2008-01-21 9:21 swhiteho
0 siblings, 0 replies; 38+ messages in thread
From: swhiteho @ 2008-01-21 9:21 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hi,
Here is the current GFS2 patch queue. You'll notice that this time
there are no DLM patches in this list. That is because the DLM team
are setting up their own git tree and thus future DLM patches will
be sent directly by them rather than via the GFS2 tree.
Most of this set of patches is clean up and bug fixes; there is really
not a lot new this time. I guess the most significant thing is the
patch to use ->page_mkwrite which will greatly increase efficiency
when files opened r/w are mostly only accessed for reading across a
cluster.
There are a number of cleanups related to journalling which is really
where the largest number of changes in terms of code lines is. The
indirect blocks for the journal are now scanned once only at mount
time and the bmap information is retained in the form of an extent
list. Since we expect journals to consist of only a single extent
in the normal case, this should generally be quite a short list :-)
In addition some of the tunables relating to the journal have been
removed in favour of autotuning those variables.
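
To illustrate the extent-list idea described above, here is a minimal user-space sketch. It is not the GFS2 code: `struct jext` and `jext_map` are names invented for this example, and the linear search assumes the list stays short, as the message expects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical extent record: 'blocks' contiguous journal blocks that
 * start at logical block 'lblock' and live on disk from 'dblock'. */
struct jext {
	uint64_t lblock;
	uint64_t dblock;
	uint64_t blocks;
};

/* Map a logical journal block to a disk block using the cached list,
 * so the indirect blocks need not be read again after mount.
 * Returns 0 and fills *dblock on success, -1 if lblock is unmapped. */
static int jext_map(const struct jext *ext, size_t n, uint64_t lblock,
		    uint64_t *dblock)
{
	size_t i;

	assert(ext != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (lblock >= ext[i].lblock &&
		    lblock - ext[i].lblock < ext[i].blocks) {
			*dblock = ext[i].dblock + (lblock - ext[i].lblock);
			return 0;
		}
	}
	return -1;
}
```

With a journal made of a single extent, every lookup is answered by the first list entry.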
Steve.
^ permalink raw reply [flat|nested] 38+ messages in thread
* [Cluster-devel] [GFS2] Pre-pull patch posting
@ 2008-04-17 8:37 swhiteho
0 siblings, 0 replies; 38+ messages in thread
From: swhiteho @ 2008-04-17 8:37 UTC (permalink / raw)
To: cluster-devel.redhat.com
This is the current content of the GFS2 -nmw git tree. Mostly bug
fixes, there are some changes relating to block mapping which are
working towards cleaning up this code and allowing more efficient
block mapping. There is a second part to that work which is not
included in this patch set - the plan is that it will be in the
next patch set and it's currently undergoing testing.
There are a number of clean up patches in the series too. We have
been continuing the work of gradually reducing the fields in the
various ..._host structures with a view to eventually eliminating
them completely. They were introduced as a stop-gap measure to fix
the endianness annotation and the fields are now gradually being moved
to other structures (or eliminated).
Bob Peterson's new improved "bitfit" algorithm provides a nice
speed up when we are allocating blocks as well as cleaning up
that area of the code.
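
For readers unfamiliar with what a "bitfit" search does, the following is a deliberately naive reference sketch, not Bob's algorithm. It assumes the GFS2 bitmap layout of two state bits per block, four blocks per byte, lowest bits first; the names `bitfit_simple` and `NO_BLOCK` are made up for this example. The patch in the series is the place to look for the real, faster implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_BLOCK  2u	/* assumed: two bits of state per block */
#define BLOCKS_PER_BYTE 4u
#define NO_BLOCK ((uint32_t)-1)	/* hypothetical "not found" marker */

/* Return the index of the first block at or after 'goal' whose 2-bit
 * state equals 'state', or NO_BLOCK if there is none. This checks one
 * block per iteration, which is the slow path a tuned version avoids. */
static uint32_t bitfit_simple(const uint8_t *buf, size_t len,
			      uint32_t goal, uint8_t state)
{
	uint32_t nblocks = (uint32_t)(len * BLOCKS_PER_BYTE);
	uint32_t blk;

	assert(state < 4);
	for (blk = goal; blk < nblocks; blk++) {
		unsigned shift = (blk % BLOCKS_PER_BYTE) * BITS_PER_BLOCK;

		if (((buf[blk / BLOCKS_PER_BYTE] >> shift) & 3u) == state)
			return blk;
	}
	return NO_BLOCK;
}
```

A caller looking for a free block would pass the allocation goal as the starting point and the "free" state value as `state`.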
Steve.
* [Cluster-devel] [GFS2] Pre-pull patch posting
@ 2008-07-11 10:11 swhiteho
0 siblings, 0 replies; 38+ messages in thread
From: swhiteho @ 2008-07-11 10:11 UTC (permalink / raw)
To: cluster-devel.redhat.com
So, although the merge window isn't yet open, I'm guessing that it's
probably not too far away, hence this posting of the contents of
the current GFS2 -nmw git tree.
This time the big news is locking changes, although having said that,
there are far fewer queued patches in total than I've had for previous
merge windows and I believe that this is an indication of the
growing maturity of GFS2.
The first patch in the series is really the main change and is
a big clean up of the core of the glocks, which are really the
core of GFS2 in a lot of ways. Further through the series is
a documentation patch, which explains the fine detail of how
glocks work and the assumptions made by the glock core when
calling the functions relating to individual glock types.
Other notable changes include merging the lock_nolock module
into the core of GFS2 since there is little point in retaining
it separately. There is a plan to do the same to lock_dlm as
well in the future (not the DLM itself obviously, just the
interface module that's part of GFS2).
Most of the remaining changes are bug fixes or further optimisations
over the initial glock changes, plus one or two minor clean ups
along the way.
Steve.
* [Cluster-devel] GFS2: Pre-pull patch posting
@ 2008-09-26 12:00 Steven Whitehouse
0 siblings, 0 replies; 38+ messages in thread
From: Steven Whitehouse @ 2008-09-26 12:00 UTC (permalink / raw)
To: cluster-devel.redhat.com
I'm guessing that the merge window opening might not be too far
away now, and in any case, I won't have quite my normal internet
access next week. So I'm pushing out the current GFS2 tree so
that (I hope) there will be time to fix any issues.
Again, there are fewer patches here. A lot of them are fairly small
too. The most notable item deals with the meta filesystem which
was in response to Al Viro's suggestions concerning a better way
to structure that code. It certainly results in a much cleaner
implementation, so thanks go to Al for pointing that out.
A couple of new features: I/O barrier support (needs no user
configuration, see the patch for details) and UUID support
(no code changes, it's all userland but we reserve space in
the super block, again details in the patch itself).
I know that I hardly need say, but please let me know if you have any
comments :-)
Steve.
* [Cluster-devel] GFS2: Pre-pull patch posting
@ 2008-12-17 11:29 swhiteho
0 siblings, 0 replies; 38+ messages in thread
From: swhiteho @ 2008-12-17 11:29 UTC (permalink / raw)
To: cluster-devel.redhat.com
In preparation for the next merge window, here is the current content
of the GFS2 git tree. Firstly, we have one new feature, which is
support for the FIEMAP ioctl. That patch does touch some code
outside of GFS2 itself, but it's the only patch in this series which
does so.
The remaining patches are mostly clean up and bug fixes, as usual.
They are working towards a point where I can submit a patch to finally
merge the lock_dlm module (not dlm itself I should emphasise) into
GFS2. That is a large patch and a preliminary version has already been
posted to cluster-devel. My plan is to put that patch into my -nmw
git tree as the first patch for the following merge window to give
it maximum exposure.
The other highlight of this patch series, is a patch which removes
the two daemons (gfs2_scand and gfs2_glockd) and replaces them with
a "shrinker" routine registered with the VM. As expected, this also
reduces the code size. We are also expecting to do a similar thing
with the GFS2 quota data structures at some point in the future.
Steve.
* [Cluster-devel] [GFS2] Pre-pull patch posting
@ 2009-03-18 12:23 swhiteho
0 siblings, 0 replies; 38+ messages in thread
From: swhiteho @ 2009-03-18 12:23 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hi,
So as the merge window draws closer, here is the current content of the
GFS2 git tree. The major item this time is patch 5 in the series. This
contains by far the majority of the changes, and the majority of those
changes are actually removal of code. The patch merges the lock_dlm
module (not the dlm itself, but GFS2's interface to the dlm) into
GFS2 itself. This means that a number of optimisations are then
possible in terms of merging structures resulting in a considerable
saving in memory.
Since that patch is so large (I'm afraid that it really doesn't make
any sense to split it up) it's been in the -nmw git tree for the whole
period since the last merge window and has also been posted for review
on cluster-devel on a number of occasions before that. We've run a
number of tests on it as well in that period, so I believe that it's
pretty stable now. It certainly makes the code a lot cleaner and
easier to follow in that area.
The remainder of the patches are mostly bug fixes, but there are one
or two other interesting features, those being:
o GFS2 now supports the discard I/O requests for thin provisioning, etc
o A new "demote a glock" interface is added to sysfs to help in
testing GFS2
o With a new mkfs.gfs2 which writes UUIDs, the UUID is now included
in uevent messages (with older filesystems which don't have UUIDs,
we just don't send that information)
The GFS2 tracing patches which I posted a little while back are not
included in this patch set. I think I can see what I need to do in
order to avoid patching blktrace now, so my plan is to look at those
patches again after this merge window, and when all the queued patches
for the tracing subsystem have been merged.
As always, please let us know if you spot any issues in the patches,
Steve.
* [Cluster-devel] GFS2: Pre-pull patch posting
@ 2009-06-10 8:30 Steven Whitehouse
0 siblings, 0 replies; 38+ messages in thread
From: Steven Whitehouse @ 2009-06-10 8:30 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hi,
As the merge window is more or less upon us, here is the content
of the GFS2 -nmw tree. There is nothing too startling this time
as the focus has very much been bug fixes and clean up.
We have a new mount option, commit= which does exactly the same
thing as the ext3 equivalent option. We have a long term plan to
make all the tunable parameters available as mount options and
thus to be able to eventually drop the sysfs interface to these
parameters.
Another long term plan is to get rid of the files named
ops_somethingorother and to either merge them into other
files, or rename them to not have the ops_ prefix. This
patch series makes a start on that, and does all the easy
ones. As a result some functions with only one caller are
moved to the same file as their caller and made static.
The docs are also updated to reflect the fact that the
lock_dlm interface module no longer exists and that interface is
now built into GFS2.
Steve.
* [Cluster-devel] GFS2: Pre-pull patch posting
@ 2009-09-10 11:27 Steven Whitehouse
0 siblings, 0 replies; 38+ messages in thread
From: Steven Whitehouse @ 2009-09-10 11:27 UTC (permalink / raw)
To: cluster-devel.redhat.com
As merge time is approaching, here is the current content of the
GFS2 -nmw git tree. I'm not expecting to take any more patches
now for the current merge window unless any last minute bugs
are discovered.
There is not a huge amount new this time. Some extra context for
uevent messages, better error handling during block allocation,
and a clean up of extended attribute support. There is still more
to do on the extended attribute side of things, but this is a good
start I think.
There are a few bug fixes as well. Once these patches are merged
I'm intending to start off the next -nmw tree with a patch to
remove some of the (now unused) sysfs files as per the message
on cluster-devel a few weeks back.
Steve.
* [Cluster-devel] GFS2: Pre-pull patch posting
@ 2010-03-01 15:08 Steven Whitehouse
0 siblings, 0 replies; 38+ messages in thread
From: Steven Whitehouse @ 2010-03-01 15:08 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hi,
Not so many patches for GFS2 this merge window. The bulk of the changes are
aimed at reducing overheads when caching large numbers of inodes and
the consequent simplification of the umount code,
Steve.
* [Cluster-devel] GFS2: Pre-pull patch posting
@ 2010-03-11 17:21 Steven Whitehouse
0 siblings, 0 replies; 38+ messages in thread
From: Steven Whitehouse @ 2010-03-11 17:21 UTC (permalink / raw)
To: cluster-devel.redhat.com
Here are three small (but important!) fixes to GFS2.
Steve.
* [Cluster-devel] GFS2: Pre-pull patch posting
@ 2010-05-17 12:40 Steven Whitehouse
0 siblings, 0 replies; 38+ messages in thread
From: Steven Whitehouse @ 2010-05-17 12:40 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hi,
Nothing very exciting this time.... mostly minor bug fixes and
a docs update. The gfs2_logd patch has been hanging around for
a long time and is now finally integrated. It is the first step
towards a longer term goal of improving performance in that
area,
Steve.
* [Cluster-devel] GFS2: Pre-pull patch posting
@ 2010-08-02 9:27 Steven Whitehouse
0 siblings, 0 replies; 38+ messages in thread
From: Steven Whitehouse @ 2010-08-02 9:27 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hi,
Here is the current content of the GFS2 -nmw git tree. Mostly it's
just clean up and bug fixes this time. There is one exception which
is the "wait for journal id" patch which is a new feature aimed
at (eventually) allowing us to simplify the userland support which
GFS2 requires,
Steve.
* [Cluster-devel] GFS2: Pre-pull patch posting
@ 2010-10-18 14:15 Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 01/22] GFS2: New truncate sequence Steven Whitehouse
` (21 more replies)
0 siblings, 22 replies; 38+ messages in thread
From: Steven Whitehouse @ 2010-10-18 14:15 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hi,
I know the merge window isn't open yet, but at this stage I'm going to
hold off on any larger patches until the following merge window so this
patch set isn't likely to change much, hence kicking it out a bit early
for review.
There are a few interesting points to note in this patch set:
o GFS2 is updated to use the new truncate sequence
o Support for fallocate is added
o Clean up of some unused/obsolete mount options
I'm currently working on a patch to allow the glock hash table
to use RCU. That is currently a work-in-progress and that will
hopefully be ready for the succeeding merge window.
Steve.
* [Cluster-devel] [PATCH 01/22] GFS2: New truncate sequence
2010-10-18 14:15 [Cluster-devel] GFS2: Pre-pull patch posting Steven Whitehouse
@ 2010-10-18 14:15 ` Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 02/22] GFS2: Remove i_disksize Steven Whitehouse
` (20 subsequent siblings)
21 siblings, 0 replies; 38+ messages in thread
From: Steven Whitehouse @ 2010-10-18 14:15 UTC (permalink / raw)
To: cluster-devel.redhat.com
This updates GFS2's truncate code to use the new truncate
sequence correctly. This is a stepping stone to being
able to remove ip->i_disksize in favour of using i_size
everywhere now that the two sizes are always identical.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index 194fe16..f687f25 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -696,13 +696,11 @@ out:
page_cache_release(page);
- /*
- * XXX(truncate): the call below should probably be replaced with
- * a call to the gfs2-specific truncate blocks helper to actually
- * release disk blocks..
- */
+ gfs2_trans_end(sdp);
if (pos + len > ip->i_inode.i_size)
- truncate_setsize(&ip->i_inode, ip->i_inode.i_size);
+ gfs2_trim_blocks(&ip->i_inode);
+ goto out_trans_fail;
+
out_endtrans:
gfs2_trans_end(sdp);
out_trans_fail:
diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index 6f48280..20b971a 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -50,7 +50,7 @@ struct strip_mine {
* @ip: the inode
* @dibh: the dinode buffer
* @block: the block number that was allocated
- * @private: any locked page held by the caller process
+ * @page: The (optional) page. This is looked up if @page is NULL
*
* Returns: errno
*/
@@ -109,8 +109,7 @@ static int gfs2_unstuffer_page(struct gfs2_inode *ip, struct buffer_head *dibh,
/**
* gfs2_unstuff_dinode - Unstuff a dinode when the data has grown too big
* @ip: The GFS2 inode to unstuff
- * @unstuffer: the routine that handles unstuffing a non-zero length file
- * @private: private data for the unstuffer
+ * @page: The (optional) page. This is looked up if the @page is NULL
*
* This routine unstuffs a dinode and returns it to a "normal" state such
* that the height can be grown in the traditional way.
@@ -885,83 +884,14 @@ out:
}
/**
- * do_grow - Make a file look bigger than it is
- * @ip: the inode
- * @size: the size to set the file to
- *
- * Called with an exclusive lock on @ip.
- *
- * Returns: errno
- */
-
-static int do_grow(struct gfs2_inode *ip, u64 size)
-{
- struct gfs2_sbd *sdp = GFS2_SB(&ip->i_inode);
- struct gfs2_alloc *al;
- struct buffer_head *dibh;
- int error;
-
- al = gfs2_alloc_get(ip);
- if (!al)
- return -ENOMEM;
-
- error = gfs2_quota_lock_check(ip);
- if (error)
- goto out;
-
- al->al_requested = sdp->sd_max_height + RES_DATA;
-
- error = gfs2_inplace_reserve(ip);
- if (error)
- goto out_gunlock_q;
-
- error = gfs2_trans_begin(sdp,
- sdp->sd_max_height + al->al_rgd->rd_length +
- RES_JDATA + RES_DINODE + RES_STATFS + RES_QUOTA, 0);
- if (error)
- goto out_ipres;
-
- error = gfs2_meta_inode_buffer(ip, &dibh);
- if (error)
- goto out_end_trans;
-
- if (size > sdp->sd_sb.sb_bsize - sizeof(struct gfs2_dinode)) {
- if (gfs2_is_stuffed(ip)) {
- error = gfs2_unstuff_dinode(ip, NULL);
- if (error)
- goto out_brelse;
- }
- }
-
- ip->i_disksize = size;
- ip->i_inode.i_mtime = ip->i_inode.i_ctime = CURRENT_TIME;
- gfs2_trans_add_bh(ip->i_gl, dibh, 1);
- gfs2_dinode_out(ip, dibh->b_data);
-
-out_brelse:
- brelse(dibh);
-out_end_trans:
- gfs2_trans_end(sdp);
-out_ipres:
- gfs2_inplace_release(ip);
-out_gunlock_q:
- gfs2_quota_unlock(ip);
-out:
- gfs2_alloc_put(ip);
- return error;
-}
-
-
-/**
* gfs2_block_truncate_page - Deal with zeroing out data for truncate
*
* This is partly borrowed from ext3.
*/
-static int gfs2_block_truncate_page(struct address_space *mapping)
+static int gfs2_block_truncate_page(struct address_space *mapping, loff_t from)
{
struct inode *inode = mapping->host;
struct gfs2_inode *ip = GFS2_I(inode);
- loff_t from = inode->i_size;
unsigned long index = from >> PAGE_CACHE_SHIFT;
unsigned offset = from & (PAGE_CACHE_SIZE-1);
unsigned blocksize, iblock, length, pos;
@@ -1023,9 +953,11 @@ unlock:
return err;
}
-static int trunc_start(struct gfs2_inode *ip, u64 size)
+static int trunc_start(struct inode *inode, u64 oldsize, u64 newsize)
{
- struct gfs2_sbd *sdp = GFS2_SB(&ip->i_inode);
+ struct gfs2_inode *ip = GFS2_I(inode);
+ struct gfs2_sbd *sdp = GFS2_SB(inode);
+ struct address_space *mapping = inode->i_mapping;
struct buffer_head *dibh;
int journaled = gfs2_is_jdata(ip);
int error;
@@ -1039,31 +971,27 @@ static int trunc_start(struct gfs2_inode *ip, u64 size)
if (error)
goto out;
+ gfs2_trans_add_bh(ip->i_gl, dibh, 1);
+
if (gfs2_is_stuffed(ip)) {
- u64 dsize = size + sizeof(struct gfs2_dinode);
- ip->i_disksize = size;
- ip->i_inode.i_mtime = ip->i_inode.i_ctime = CURRENT_TIME;
- gfs2_trans_add_bh(ip->i_gl, dibh, 1);
- gfs2_dinode_out(ip, dibh->b_data);
- if (dsize > dibh->b_size)
- dsize = dibh->b_size;
- gfs2_buffer_clear_tail(dibh, dsize);
- error = 1;
+ gfs2_buffer_clear_tail(dibh, sizeof(struct gfs2_dinode) + newsize);
} else {
- if (size & (u64)(sdp->sd_sb.sb_bsize - 1))
- error = gfs2_block_truncate_page(ip->i_inode.i_mapping);
-
- if (!error) {
- ip->i_disksize = size;
- ip->i_inode.i_mtime = ip->i_inode.i_ctime = CURRENT_TIME;
- ip->i_diskflags |= GFS2_DIF_TRUNC_IN_PROG;
- gfs2_trans_add_bh(ip->i_gl, dibh, 1);
- gfs2_dinode_out(ip, dibh->b_data);
+ if (newsize & (u64)(sdp->sd_sb.sb_bsize - 1)) {
+ error = gfs2_block_truncate_page(mapping, newsize);
+ if (error)
+ goto out_brelse;
}
+ ip->i_diskflags |= GFS2_DIF_TRUNC_IN_PROG;
}
- brelse(dibh);
+ i_size_write(inode, newsize);
+ ip->i_disksize = newsize;
+ ip->i_inode.i_mtime = ip->i_inode.i_ctime = CURRENT_TIME;
+ gfs2_dinode_out(ip, dibh->b_data);
+ truncate_pagecache(inode, oldsize, newsize);
+out_brelse:
+ brelse(dibh);
out:
gfs2_trans_end(sdp);
return error;
@@ -1143,86 +1071,149 @@ out:
/**
* do_shrink - make a file smaller
- * @ip: the inode
- * @size: the size to make the file
- * @truncator: function to truncate the last partial block
+ * @inode: the inode
+ * @oldsize: the current inode size
+ * @newsize: the size to make the file
*
- * Called with an exclusive lock on @ip.
+ * Called with an exclusive lock on @inode. The @size must
+ * be equal to or smaller than the current inode size.
*
* Returns: errno
*/
-static int do_shrink(struct gfs2_inode *ip, u64 size)
+static int do_shrink(struct inode *inode, u64 oldsize, u64 newsize)
{
+ struct gfs2_inode *ip = GFS2_I(inode);
int error;
- error = trunc_start(ip, size);
+ error = trunc_start(inode, oldsize, newsize);
if (error < 0)
return error;
- if (error > 0)
+ if (gfs2_is_stuffed(ip))
return 0;
- error = trunc_dealloc(ip, size);
- if (!error)
+ error = trunc_dealloc(ip, newsize);
+ if (error == 0)
error = trunc_end(ip);
return error;
}
-static int do_touch(struct gfs2_inode *ip, u64 size)
+void gfs2_trim_blocks(struct inode *inode)
{
- struct gfs2_sbd *sdp = GFS2_SB(&ip->i_inode);
+ u64 size = inode->i_size;
+ int ret;
+
+ ret = do_shrink(inode, size, size);
+ WARN_ON(ret != 0);
+}
+
+/**
+ * do_grow - Touch and update inode size
+ * @inode: The inode
+ * @size: The new size
+ *
+ * This function updates the timestamps on the inode and
+ * may also increase the size of the inode. This function
+ * must not be called with @size any smaller than the current
+ * inode size.
+ *
+ * Although it is not strictly required to unstuff files here,
+ * earlier versions of GFS2 have a bug in the stuffed file reading
+ * code which will result in a buffer overrun if the size is larger
+ * than the max stuffed file size. In order to prevent this from
+ * occuring, such files are unstuffed, but in other cases we can
+ * just update the inode size directly.
+ *
+ * Returns: 0 on success, or -ve on error
+ */
+
+static int do_grow(struct inode *inode, u64 size)
+{
+ struct gfs2_inode *ip = GFS2_I(inode);
+ struct gfs2_sbd *sdp = GFS2_SB(inode);
struct buffer_head *dibh;
+ struct gfs2_alloc *al = NULL;
int error;
- error = gfs2_trans_begin(sdp, RES_DINODE, 0);
+ if (gfs2_is_stuffed(ip) &&
+ (size > (sdp->sd_sb.sb_bsize - sizeof(struct gfs2_dinode)))) {
+ al = gfs2_alloc_get(ip);
+ if (al == NULL)
+ return -ENOMEM;
+
+ error = gfs2_quota_lock_check(ip);
+ if (error)
+ goto do_grow_alloc_put;
+
+ al->al_requested = 1;
+ error = gfs2_inplace_reserve(ip);
+ if (error)
+ goto do_grow_qunlock;
+ }
+
+ error = gfs2_trans_begin(sdp, RES_DINODE + 1, 0);
if (error)
- return error;
+ goto do_grow_release;
- down_write(&ip->i_rw_mutex);
+ if (al) {
+ error = gfs2_unstuff_dinode(ip, NULL);
+ if (error)
+ goto do_end_trans;
+ }
error = gfs2_meta_inode_buffer(ip, &dibh);
if (error)
- goto do_touch_out;
+ goto do_end_trans;
+ i_size_write(inode, size);
+ ip->i_disksize = size;
ip->i_inode.i_mtime = ip->i_inode.i_ctime = CURRENT_TIME;
gfs2_trans_add_bh(ip->i_gl, dibh, 1);
gfs2_dinode_out(ip, dibh->b_data);
brelse(dibh);
-do_touch_out:
- up_write(&ip->i_rw_mutex);
+do_end_trans:
gfs2_trans_end(sdp);
+do_grow_release:
+ if (al) {
+ gfs2_inplace_release(ip);
+do_grow_qunlock:
+ gfs2_quota_unlock(ip);
+do_grow_alloc_put:
+ gfs2_alloc_put(ip);
+ }
return error;
}
/**
- * gfs2_truncatei - make a file a given size
- * @ip: the inode
- * @size: the size to make the file
- * @truncator: function to truncate the last partial block
+ * gfs2_setattr_size - make a file a given size
+ * @inode: the inode
+ * @newsize: the size to make the file
*
- * The file size can grow, shrink, or stay the same size.
+ * The file size can grow, shrink, or stay the same size. This
+ * is called holding i_mutex and an exclusive glock on the inode
+ * in question.
*
* Returns: errno
*/
-int gfs2_truncatei(struct gfs2_inode *ip, u64 size)
+int gfs2_setattr_size(struct inode *inode, u64 newsize)
{
- int error;
+ int ret;
+ u64 oldsize;
- if (gfs2_assert_warn(GFS2_SB(&ip->i_inode), S_ISREG(ip->i_inode.i_mode)))
- return -EINVAL;
+ BUG_ON(!S_ISREG(inode->i_mode));
- if (size > ip->i_disksize)
- error = do_grow(ip, size);
- else if (size < ip->i_disksize)
- error = do_shrink(ip, size);
- else
- /* update time stamps */
- error = do_touch(ip, size);
+ ret = inode_newsize_ok(inode, newsize);
+ if (ret)
+ return ret;
- return error;
+ oldsize = inode->i_size;
+ if (newsize >= oldsize)
+ return do_grow(inode, newsize);
+
+ return do_shrink(inode, oldsize, newsize);
}
int gfs2_truncatei_resume(struct gfs2_inode *ip)
diff --git a/fs/gfs2/bmap.h b/fs/gfs2/bmap.h
index a20a521..42fea03 100644
--- a/fs/gfs2/bmap.h
+++ b/fs/gfs2/bmap.h
@@ -44,14 +44,16 @@ static inline void gfs2_write_calc_reserv(const struct gfs2_inode *ip,
}
}
-int gfs2_unstuff_dinode(struct gfs2_inode *ip, struct page *page);
-int gfs2_block_map(struct inode *inode, sector_t lblock, struct buffer_head *bh, int create);
-int gfs2_extent_map(struct inode *inode, u64 lblock, int *new, u64 *dblock, unsigned *extlen);
-
-int gfs2_truncatei(struct gfs2_inode *ip, u64 size);
-int gfs2_truncatei_resume(struct gfs2_inode *ip);
-int gfs2_file_dealloc(struct gfs2_inode *ip);
-int gfs2_write_alloc_required(struct gfs2_inode *ip, u64 offset,
- unsigned int len);
+extern int gfs2_unstuff_dinode(struct gfs2_inode *ip, struct page *page);
+extern int gfs2_block_map(struct inode *inode, sector_t lblock,
+ struct buffer_head *bh, int create);
+extern int gfs2_extent_map(struct inode *inode, u64 lblock, int *new,
+ u64 *dblock, unsigned *extlen);
+extern int gfs2_setattr_size(struct inode *inode, u64 size);
+extern void gfs2_trim_blocks(struct inode *inode);
+extern int gfs2_truncatei_resume(struct gfs2_inode *ip);
+extern int gfs2_file_dealloc(struct gfs2_inode *ip);
+extern int gfs2_write_alloc_required(struct gfs2_inode *ip, u64 offset,
+ unsigned int len);
#endif /* __BMAP_DOT_H__ */
diff --git a/fs/gfs2/ops_inode.c b/fs/gfs2/ops_inode.c
index 1009be2..1d3f2fb 100644
--- a/fs/gfs2/ops_inode.c
+++ b/fs/gfs2/ops_inode.c
@@ -1071,30 +1071,6 @@ int gfs2_permission(struct inode *inode, int mask)
return error;
}
-/*
- * XXX(truncate): the truncate_setsize calls should be moved to the end.
- */
-static int setattr_size(struct inode *inode, struct iattr *attr)
-{
- struct gfs2_inode *ip = GFS2_I(inode);
- struct gfs2_sbd *sdp = GFS2_SB(inode);
- int error;
-
- if (attr->ia_size != ip->i_disksize) {
- error = gfs2_trans_begin(sdp, 0, sdp->sd_jdesc->jd_blocks);
- if (error)
- return error;
- truncate_setsize(inode, attr->ia_size);
- gfs2_trans_end(sdp);
- }
-
- error = gfs2_truncatei(ip, attr->ia_size);
- if (error && (inode->i_size != ip->i_disksize))
- i_size_write(inode, ip->i_disksize);
-
- return error;
-}
-
static int setattr_chown(struct inode *inode, struct iattr *attr)
{
struct gfs2_inode *ip = GFS2_I(inode);
@@ -1195,7 +1171,7 @@ static int gfs2_setattr(struct dentry *dentry, struct iattr *attr)
goto out;
if (attr->ia_valid & ATTR_SIZE)
- error = setattr_size(inode, attr);
+ error = gfs2_setattr_size(inode, attr->ia_size);
else if (attr->ia_valid & (ATTR_UID | ATTR_GID))
error = setattr_chown(inode, attr);
else if ((attr->ia_valid & ATTR_MODE) && IS_POSIXACL(inode))
--
1.7.1.1
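
As a reading aid for the patch above, this user-space sketch models only the decision made by the new gfs2_setattr_size(), do_grow() and do_shrink() paths. It is not kernel code: `struct toy_inode`, `enum size_op` and `classify_setattr_size` are invented for this example, and the block size and dinode size are passed in as plain parameters instead of being read from the superblock.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical labels for the four paths visible in the patch. */
enum size_op {
	OP_GROW,		/* update size and timestamps only */
	OP_GROW_UNSTUFF,	/* unstuff first, then update size */
	OP_SHRINK_STUFFED,	/* clear the tail of the dinode buffer */
	OP_SHRINK_DEALLOC	/* trunc_start + trunc_dealloc + trunc_end */
};

/* Toy stand-in for the inode state the patch consults. */
struct toy_inode {
	uint64_t size;
	int stuffed;	/* non-zero: data is stored inside the dinode block */
};

/* Mirror the branch structure: newsize >= oldsize goes to the grow
 * path (which also covers the equal-size "touch" case), anything
 * smaller goes to the shrink path. */
static enum size_op classify_setattr_size(const struct toy_inode *ip,
					  uint64_t newsize, uint64_t bsize,
					  uint64_t dinode_size)
{
	assert(bsize > dinode_size);
	if (newsize >= ip->size) {
		if (ip->stuffed && newsize > bsize - dinode_size)
			return OP_GROW_UNSTUFF;
		return OP_GROW;
	}
	return ip->stuffed ? OP_SHRINK_STUFFED : OP_SHRINK_DEALLOC;
}
```

Note that an unchanged size now takes the grow path, which is why the separate do_touch() function could be removed.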
* [Cluster-devel] [PATCH 02/22] GFS2: Remove i_disksize
2010-10-18 14:15 [Cluster-devel] GFS2: Pre-pull patch posting Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 01/22] GFS2: New truncate sequence Steven Whitehouse
@ 2010-10-18 14:15 ` Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 03/22] GFS2: No longer experimental Steven Whitehouse
` (19 subsequent siblings)
21 siblings, 0 replies; 38+ messages in thread
From: Steven Whitehouse @ 2010-10-18 14:15 UTC (permalink / raw)
To: cluster-devel.redhat.com
With the update of the truncate code, ip->i_disksize and
inode->i_size are merely copies of each other. This means
we can remove ip->i_disksize and use inode->i_size exclusively
reducing the size of a GFS2 inode by 8 bytes.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index f687f25..c92f36b 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -800,10 +800,8 @@ static int gfs2_stuffed_write_end(struct inode *inode, struct buffer_head *dibh,
page_cache_release(page);
if (copied) {
- if (inode->i_size < to) {
+ if (inode->i_size < to)
i_size_write(inode, to);
- ip->i_disksize = inode->i_size;
- }
gfs2_dinode_out(ip, di);
mark_inode_dirty(inode);
}
@@ -874,8 +872,6 @@ static int gfs2_write_end(struct file *file, struct address_space *mapping,
ret = generic_write_end(file, mapping, pos, len, copied, page, fsdata);
if (ret > 0) {
- if (inode->i_size > ip->i_disksize)
- ip->i_disksize = inode->i_size;
gfs2_dinode_out(ip, dibh->b_data);
mark_inode_dirty(inode);
}
diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index 20b971a..04513e9 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -131,7 +131,7 @@ int gfs2_unstuff_dinode(struct gfs2_inode *ip, struct page *page)
if (error)
goto out;
- if (ip->i_disksize) {
+ if (i_size_read(&ip->i_inode)) {
/* Get a free block, fill it with the stuffed data,
and write it out to disk */
@@ -160,7 +160,7 @@ int gfs2_unstuff_dinode(struct gfs2_inode *ip, struct page *page)
di = (struct gfs2_dinode *)dibh->b_data;
gfs2_buffer_clear_tail(dibh, sizeof(struct gfs2_dinode));
- if (ip->i_disksize) {
+ if (i_size_read(&ip->i_inode)) {
*(__be64 *)(di + 1) = cpu_to_be64(block);
gfs2_add_inode_blocks(&ip->i_inode, 1);
di->di_blocks = cpu_to_be64(gfs2_get_inode_blocks(&ip->i_inode));
@@ -985,7 +985,6 @@ static int trunc_start(struct inode *inode, u64 oldsize, u64 newsize)
}
i_size_write(inode, newsize);
- ip->i_disksize = newsize;
ip->i_inode.i_mtime = ip->i_inode.i_ctime = CURRENT_TIME;
gfs2_dinode_out(ip, dibh->b_data);
@@ -1051,7 +1050,7 @@ static int trunc_end(struct gfs2_inode *ip)
if (error)
goto out;
- if (!ip->i_disksize) {
+ if (!i_size_read(&ip->i_inode)) {
ip->i_height = 0;
ip->i_goal = ip->i_no_addr;
gfs2_buffer_clear_tail(dibh, sizeof(struct gfs2_dinode));
@@ -1167,7 +1166,6 @@ static int do_grow(struct inode *inode, u64 size)
goto do_end_trans;
i_size_write(inode, size);
- ip->i_disksize = size;
ip->i_inode.i_mtime = ip->i_inode.i_ctime = CURRENT_TIME;
gfs2_trans_add_bh(ip->i_gl, dibh, 1);
gfs2_dinode_out(ip, dibh->b_data);
@@ -1219,7 +1217,7 @@ int gfs2_setattr_size(struct inode *inode, u64 newsize)
int gfs2_truncatei_resume(struct gfs2_inode *ip)
{
int error;
- error = trunc_dealloc(ip, ip->i_disksize);
+ error = trunc_dealloc(ip, i_size_read(&ip->i_inode));
if (!error)
error = trunc_end(ip);
return error;
@@ -1260,7 +1258,7 @@ int gfs2_write_alloc_required(struct gfs2_inode *ip, u64 offset,
shift = sdp->sd_sb.sb_bsize_shift;
BUG_ON(gfs2_is_dir(ip));
- end_of_file = (ip->i_disksize + sdp->sd_sb.sb_bsize - 1) >> shift;
+ end_of_file = (i_size_read(&ip->i_inode) + sdp->sd_sb.sb_bsize - 1) >> shift;
lblock = offset >> shift;
lblock_stop = (offset + len + sdp->sd_sb.sb_bsize - 1) >> shift;
if (lblock_stop > end_of_file)
diff --git a/fs/gfs2/dir.c b/fs/gfs2/dir.c
index b9dd88a..c1042ae 100644
--- a/fs/gfs2/dir.c
+++ b/fs/gfs2/dir.c
@@ -127,8 +127,8 @@ static int gfs2_dir_write_stuffed(struct gfs2_inode *ip, const char *buf,
gfs2_trans_add_bh(ip->i_gl, dibh, 1);
memcpy(dibh->b_data + offset + sizeof(struct gfs2_dinode), buf, size);
- if (ip->i_disksize < offset + size)
- ip->i_disksize = offset + size;
+ if (ip->i_inode.i_size < offset + size)
+ i_size_write(&ip->i_inode, offset + size);
ip->i_inode.i_mtime = ip->i_inode.i_ctime = CURRENT_TIME;
gfs2_dinode_out(ip, dibh->b_data);
@@ -225,8 +225,8 @@ out:
if (error)
return error;
- if (ip->i_disksize < offset + copied)
- ip->i_disksize = offset + copied;
+ if (ip->i_inode.i_size < offset + copied)
+ i_size_write(&ip->i_inode, offset + copied);
ip->i_inode.i_mtime = ip->i_inode.i_ctime = CURRENT_TIME;
gfs2_trans_add_bh(ip->i_gl, dibh, 1);
@@ -275,12 +275,13 @@ static int gfs2_dir_read_data(struct gfs2_inode *ip, char *buf, u64 offset,
unsigned int o;
int copied = 0;
int error = 0;
+ u64 disksize = i_size_read(&ip->i_inode);
- if (offset >= ip->i_disksize)
+ if (offset >= disksize)
return 0;
- if (offset + size > ip->i_disksize)
- size = ip->i_disksize - offset;
+ if (offset + size > disksize)
+ size = disksize - offset;
if (!size)
return 0;
@@ -727,7 +728,7 @@ static struct gfs2_dirent *gfs2_dirent_search(struct inode *inode,
unsigned hsize = 1 << ip->i_depth;
unsigned index;
u64 ln;
- if (hsize * sizeof(u64) != ip->i_disksize) {
+ if (hsize * sizeof(u64) != i_size_read(inode)) {
gfs2_consist_inode(ip);
return ERR_PTR(-EIO);
}
@@ -879,7 +880,7 @@ static int dir_make_exhash(struct inode *inode)
for (x = sdp->sd_hash_ptrs; x--; lp++)
*lp = cpu_to_be64(bn);
- dip->i_disksize = sdp->sd_sb.sb_bsize / 2;
+ i_size_write(inode, sdp->sd_sb.sb_bsize / 2);
gfs2_add_inode_blocks(&dip->i_inode, 1);
dip->i_diskflags |= GFS2_DIF_EXHASH;
@@ -1057,11 +1058,12 @@ static int dir_double_exhash(struct gfs2_inode *dip)
u64 *buf;
u64 *from, *to;
u64 block;
+ u64 disksize = i_size_read(&dip->i_inode);
int x;
int error = 0;
hsize = 1 << dip->i_depth;
- if (hsize * sizeof(u64) != dip->i_disksize) {
+ if (hsize * sizeof(u64) != disksize) {
gfs2_consist_inode(dip);
return -EIO;
}
@@ -1072,7 +1074,7 @@ static int dir_double_exhash(struct gfs2_inode *dip)
if (!buf)
return -ENOMEM;
- for (block = dip->i_disksize >> sdp->sd_hash_bsize_shift; block--;) {
+ for (block = disksize >> sdp->sd_hash_bsize_shift; block--;) {
error = gfs2_dir_read_data(dip, (char *)buf,
block * sdp->sd_hash_bsize,
sdp->sd_hash_bsize, 1);
@@ -1370,7 +1372,7 @@ static int dir_e_read(struct inode *inode, u64 *offset, void *opaque,
unsigned depth = 0;
hsize = 1 << dip->i_depth;
- if (hsize * sizeof(u64) != dip->i_disksize) {
+ if (hsize * sizeof(u64) != i_size_read(inode)) {
gfs2_consist_inode(dip);
return -EIO;
}
@@ -1784,7 +1786,7 @@ static int foreach_leaf(struct gfs2_inode *dip, leaf_call_t lc, void *data)
int error = 0;
hsize = 1 << dip->i_depth;
- if (hsize * sizeof(u64) != dip->i_disksize) {
+ if (hsize * sizeof(u64) != i_size_read(&dip->i_inode)) {
gfs2_consist_inode(dip);
return -EIO;
}
diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c
index 4edd662..daadcd2 100644
--- a/fs/gfs2/file.c
+++ b/fs/gfs2/file.c
@@ -491,7 +491,7 @@ static int gfs2_open(struct inode *inode, struct file *file)
goto fail;
if (!(file->f_flags & O_LARGEFILE) &&
- ip->i_disksize > MAX_NON_LFS) {
+ i_size_read(inode) > MAX_NON_LFS) {
error = -EOVERFLOW;
goto fail_gunlock;
}
diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c
index 49f97d3..621d80e 100644
--- a/fs/gfs2/glops.c
+++ b/fs/gfs2/glops.c
@@ -262,13 +262,12 @@ static int inode_go_dump(struct seq_file *seq, const struct gfs2_glock *gl)
const struct gfs2_inode *ip = gl->gl_object;
if (ip == NULL)
return 0;
- gfs2_print_dbg(seq, " I: n:%llu/%llu t:%u f:0x%02lx d:0x%08x s:%llu/%llu\n",
+ gfs2_print_dbg(seq, " I: n:%llu/%llu t:%u f:0x%02lx d:0x%08x s:%llu\n",
(unsigned long long)ip->i_no_formal_ino,
(unsigned long long)ip->i_no_addr,
IF2DT(ip->i_inode.i_mode), ip->i_flags,
(unsigned int)ip->i_diskflags,
- (unsigned long long)ip->i_inode.i_size,
- (unsigned long long)ip->i_disksize);
+ (unsigned long long)i_size_read(&ip->i_inode));
return 0;
}
diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index fdbf4b3..c119717 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -267,7 +267,6 @@ struct gfs2_inode {
u64 i_no_formal_ino;
u64 i_generation;
u64 i_eattr;
- loff_t i_disksize;
unsigned long i_flags; /* GIF_... */
struct gfs2_glock *i_gl; /* Move into i_gh? */
struct gfs2_holder i_iopen_gh;
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 08140f1..06370f8 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -359,8 +359,7 @@ static int gfs2_dinode_in(struct gfs2_inode *ip, const void *buf)
* to do that.
*/
ip->i_inode.i_nlink = be32_to_cpu(str->di_nlink);
- ip->i_disksize = be64_to_cpu(str->di_size);
- i_size_write(&ip->i_inode, ip->i_disksize);
+ i_size_write(&ip->i_inode, be64_to_cpu(str->di_size));
gfs2_set_inode_blocks(&ip->i_inode, be64_to_cpu(str->di_blocks));
atime.tv_sec = be64_to_cpu(str->di_atime);
atime.tv_nsec = be32_to_cpu(str->di_atime_nsec);
@@ -1055,7 +1054,7 @@ void gfs2_dinode_out(const struct gfs2_inode *ip, void *buf)
str->di_uid = cpu_to_be32(ip->i_inode.i_uid);
str->di_gid = cpu_to_be32(ip->i_inode.i_gid);
str->di_nlink = cpu_to_be32(ip->i_inode.i_nlink);
- str->di_size = cpu_to_be64(ip->i_disksize);
+ str->di_size = cpu_to_be64(i_size_read(&ip->i_inode));
str->di_blocks = cpu_to_be64(gfs2_get_inode_blocks(&ip->i_inode));
str->di_atime = cpu_to_be64(ip->i_inode.i_atime.tv_sec);
str->di_mtime = cpu_to_be64(ip->i_inode.i_mtime.tv_sec);
@@ -1085,8 +1084,8 @@ void gfs2_dinode_print(const struct gfs2_inode *ip)
(unsigned long long)ip->i_no_formal_ino);
printk(KERN_INFO " no_addr = %llu\n",
(unsigned long long)ip->i_no_addr);
- printk(KERN_INFO " i_disksize = %llu\n",
- (unsigned long long)ip->i_disksize);
+ printk(KERN_INFO " i_size = %llu\n",
+ (unsigned long long)i_size_read(&ip->i_inode));
printk(KERN_INFO " blocks = %llu\n",
(unsigned long long)gfs2_get_inode_blocks(&ip->i_inode));
printk(KERN_INFO " i_goal = %llu\n",
diff --git a/fs/gfs2/inode.h b/fs/gfs2/inode.h
index 300ada3..15ff4df 100644
--- a/fs/gfs2/inode.h
+++ b/fs/gfs2/inode.h
@@ -80,6 +80,19 @@ static inline void gfs2_inum_out(const struct gfs2_inode *ip,
dent->de_inum.no_addr = cpu_to_be64(ip->i_no_addr);
}
+static inline int gfs2_check_internal_file_size(struct inode *inode,
+ u64 minsize, u64 maxsize)
+{
+ u64 size = i_size_read(inode);
+ if (size < minsize || size > maxsize)
+ goto err;
+ if (size & ((1 << inode->i_blkbits) - 1))
+ goto err;
+ return 0;
+err:
+ gfs2_consist_inode(GFS2_I(inode));
+ return -EIO;
+}
extern void gfs2_set_iop(struct inode *inode);
extern struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned type,
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index 4d4b1e8..5b5c87d 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -586,7 +586,7 @@ static int map_journal_extents(struct gfs2_sbd *sdp)
prev_db = 0;
- for (lb = 0; lb < ip->i_disksize >> sdp->sd_sb.sb_bsize_shift; lb++) {
+ for (lb = 0; lb < i_size_read(jd->jd_inode) >> sdp->sd_sb.sb_bsize_shift; lb++) {
bh.b_state = 0;
bh.b_blocknr = 0;
bh.b_size = 1 << ip->i_inode.i_blkbits;
diff --git a/fs/gfs2/ops_inode.c b/fs/gfs2/ops_inode.c
index 1d3f2fb..ee6ffd5 100644
--- a/fs/gfs2/ops_inode.c
+++ b/fs/gfs2/ops_inode.c
@@ -406,7 +406,6 @@ static int gfs2_symlink(struct inode *dir, struct dentry *dentry,
ip = ghs[1].gh_gl->gl_object;
- ip->i_disksize = size;
i_size_write(inode, size);
error = gfs2_meta_inode_buffer(ip, &dibh);
@@ -461,7 +460,7 @@ static int gfs2_mkdir(struct inode *dir, struct dentry *dentry, int mode)
ip = ghs[1].gh_gl->gl_object;
ip->i_inode.i_nlink = 2;
- ip->i_disksize = sdp->sd_sb.sb_bsize - sizeof(struct gfs2_dinode);
+ i_size_write(inode, sdp->sd_sb.sb_bsize - sizeof(struct gfs2_dinode));
ip->i_diskflags |= GFS2_DIF_JDATA;
ip->i_entries = 2;
@@ -990,7 +989,7 @@ static void *gfs2_follow_link(struct dentry *dentry, struct nameidata *nd)
struct gfs2_inode *ip = GFS2_I(dentry->d_inode);
struct gfs2_holder i_gh;
struct buffer_head *dibh;
- unsigned int x;
+ unsigned int x, size;
char *buf;
int error;
@@ -1002,7 +1001,8 @@ static void *gfs2_follow_link(struct dentry *dentry, struct nameidata *nd)
return NULL;
}
- if (!ip->i_disksize) {
+ size = (unsigned int)i_size_read(&ip->i_inode);
+ if (size == 0) {
gfs2_consist_inode(ip);
buf = ERR_PTR(-EIO);
goto out;
@@ -1014,7 +1014,7 @@ static void *gfs2_follow_link(struct dentry *dentry, struct nameidata *nd)
goto out;
}
- x = ip->i_disksize + 1;
+ x = size + 1;
buf = kmalloc(x, GFP_NOFS);
if (!buf)
buf = ERR_PTR(-ENOMEM);
diff --git a/fs/gfs2/quota.c b/fs/gfs2/quota.c
index 1bc6b56..9bc6dd9 100644
--- a/fs/gfs2/quota.c
+++ b/fs/gfs2/quota.c
@@ -735,10 +735,8 @@ get_a_page:
goto out;
size = loc + sizeof(struct gfs2_quota);
- if (size > inode->i_size) {
- ip->i_disksize = size;
+ if (size > inode->i_size)
i_size_write(inode, size);
- }
inode->i_mtime = inode->i_atime = CURRENT_TIME;
gfs2_trans_add_bh(ip->i_gl, dibh, 1);
gfs2_dinode_out(ip, dibh->b_data);
@@ -1190,18 +1188,17 @@ static void gfs2_quota_change_in(struct gfs2_quota_change_host *qc, const void *
int gfs2_quota_init(struct gfs2_sbd *sdp)
{
struct gfs2_inode *ip = GFS2_I(sdp->sd_qc_inode);
- unsigned int blocks = ip->i_disksize >> sdp->sd_sb.sb_bsize_shift;
+ u64 size = i_size_read(sdp->sd_qc_inode);
+ unsigned int blocks = size >> sdp->sd_sb.sb_bsize_shift;
unsigned int x, slot = 0;
unsigned int found = 0;
u64 dblock;
u32 extlen = 0;
int error;
- if (!ip->i_disksize || ip->i_disksize > (64 << 20) ||
- ip->i_disksize & (sdp->sd_sb.sb_bsize - 1)) {
- gfs2_consist_inode(ip);
+ if (gfs2_check_internal_file_size(sdp->sd_qc_inode, 1, 64 << 20))
return -EIO;
- }
+
sdp->sd_quota_slots = blocks * sdp->sd_qc_per_block;
sdp->sd_quota_chunks = DIV_ROUND_UP(sdp->sd_quota_slots, 8 * PAGE_SIZE);
diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c
index 171a744..370c29b 100644
--- a/fs/gfs2/rgrp.c
+++ b/fs/gfs2/rgrp.c
@@ -500,7 +500,7 @@ u64 gfs2_ri_total(struct gfs2_sbd *sdp)
for (rgrps = 0;; rgrps++) {
loff_t pos = rgrps * sizeof(struct gfs2_rindex);
- if (pos + sizeof(struct gfs2_rindex) >= ip->i_disksize)
+ if (pos + sizeof(struct gfs2_rindex) >= i_size_read(inode))
break;
error = gfs2_internal_read(ip, &ra_state, buf, &pos,
sizeof(struct gfs2_rindex));
@@ -588,7 +588,7 @@ static int gfs2_ri_update(struct gfs2_inode *ip)
struct gfs2_sbd *sdp = GFS2_SB(&ip->i_inode);
struct inode *inode = &ip->i_inode;
struct file_ra_state ra_state;
- u64 rgrp_count = ip->i_disksize;
+ u64 rgrp_count = i_size_read(inode);
int error;
do_div(rgrp_count, sizeof(struct gfs2_rindex));
@@ -628,7 +628,7 @@ static int gfs2_ri_update_special(struct gfs2_inode *ip)
for (sdp->sd_rgrps = 0;; sdp->sd_rgrps++) {
/* Ignore partials */
if ((sdp->sd_rgrps + 1) * sizeof(struct gfs2_rindex) >
- ip->i_disksize)
+ i_size_read(inode))
break;
error = read_rindex_entry(ip, &ra_state);
if (error) {
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index 77cb9f8..e031fa4 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -342,15 +342,14 @@ int gfs2_jdesc_check(struct gfs2_jdesc *jd)
{
struct gfs2_inode *ip = GFS2_I(jd->jd_inode);
struct gfs2_sbd *sdp = GFS2_SB(jd->jd_inode);
+ u64 size = i_size_read(jd->jd_inode);
- if (ip->i_disksize < (8 << 20) || ip->i_disksize > (1 << 30) ||
- (ip->i_disksize & (sdp->sd_sb.sb_bsize - 1))) {
- gfs2_consist_inode(ip);
+ if (gfs2_check_internal_file_size(jd->jd_inode, 8 << 20, 1 << 30))
return -EIO;
- }
- jd->jd_blocks = ip->i_disksize >> sdp->sd_sb.sb_bsize_shift;
- if (gfs2_write_alloc_required(ip, 0, ip->i_disksize)) {
+ jd->jd_blocks = size >> sdp->sd_sb.sb_bsize_shift;
+
+ if (gfs2_write_alloc_required(ip, 0, size)) {
gfs2_consist_inode(ip);
return -EIO;
}
--
1.7.1.1
* [Cluster-devel] [PATCH 03/22] GFS2: No longer experimental
2010-10-18 14:15 [Cluster-devel] GFS2: Pre-pull patch posting Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 01/22] GFS2: New truncate sequence Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 02/22] GFS2: Remove i_disksize Steven Whitehouse
@ 2010-10-18 14:15 ` Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 04/22] GFS2: Add a bug trap in allocation code Steven Whitehouse
` (18 subsequent siblings)
21 siblings, 0 replies; 38+ messages in thread
From: Steven Whitehouse @ 2010-10-18 14:15 UTC (permalink / raw)
To: cluster-devel.redhat.com
I think the time has arrived to remove the experimental tag
from GFS2.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
diff --git a/fs/gfs2/Kconfig b/fs/gfs2/Kconfig
index cc96655..c465ae0 100644
--- a/fs/gfs2/Kconfig
+++ b/fs/gfs2/Kconfig
@@ -1,6 +1,6 @@
config GFS2_FS
tristate "GFS2 file system support"
- depends on EXPERIMENTAL && (64BIT || LBDAF)
+ depends on (64BIT || LBDAF)
select DLM if GFS2_FS_LOCKING_DLM
select CONFIGFS_FS if GFS2_FS_LOCKING_DLM
select SYSFS if GFS2_FS_LOCKING_DLM
--
1.7.1.1
* [Cluster-devel] [PATCH 04/22] GFS2: Add a bug trap in allocation code
2010-10-18 14:15 [Cluster-devel] GFS2: Pre-pull patch posting Steven Whitehouse
` (2 preceding siblings ...)
2010-10-18 14:15 ` [Cluster-devel] [PATCH 03/22] GFS2: No longer experimental Steven Whitehouse
@ 2010-10-18 14:15 ` Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 05/22] GFS2: fallocate support Steven Whitehouse
` (17 subsequent siblings)
21 siblings, 0 replies; 38+ messages in thread
From: Steven Whitehouse @ 2010-10-18 14:15 UTC (permalink / raw)
To: cluster-devel.redhat.com
This adds a check so that if we reach the block allocator with no
alloc structure hanging off the inode, we return an error rather
than trying to proceed. This should only happen if there is a bug
in GFS2. The error return code is distinctive so that it will be
easily spotted.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
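
As a rough userspace illustration of the trap described above (not the
kernel code itself), the sketch below checks for a missing allocation
context before dereferencing it. The struct and function names are
hypothetical stand-ins; only the NULL check and the -ECANCELED return
value are taken from the patch.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the GFS2 structures named in the patch. */
struct rgrpd_model { unsigned long long data0; };
struct alloc_model { struct rgrpd_model *rgd; };
struct inode_model { struct alloc_model *alloc; };

/*
 * Returns 0 and stores the first data block of the resource group, or
 * -ECANCELED when no allocation context is attached to the inode.
 */
static int alloc_block_model(const struct inode_model *ip,
			     unsigned long long *first_block)
{
	const struct alloc_model *al = ip->alloc;

	/* Check before touching al->rgd, mirroring the reordering in the patch. */
	if (al == NULL)
		return -ECANCELED;

	*first_block = al->rgd->data0;
	return 0;
}
```

The point of the reordering is that the resource group pointer is only
read once the context is known to exist.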
diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c
index 370c29b..66b6d4d 100644
--- a/fs/gfs2/rgrp.c
+++ b/fs/gfs2/rgrp.c
@@ -1496,11 +1496,19 @@ int gfs2_alloc_block(struct gfs2_inode *ip, u64 *bn, unsigned int *n)
struct gfs2_sbd *sdp = GFS2_SB(&ip->i_inode);
struct buffer_head *dibh;
struct gfs2_alloc *al = ip->i_alloc;
- struct gfs2_rgrpd *rgd = al->al_rgd;
+ struct gfs2_rgrpd *rgd;
u32 goal, blk;
u64 block;
int error;
+ /* Only happens if there is a bug in gfs2, return something distinctive
+ * to ensure that it is noticed.
+ */
+ if (al == NULL)
+ return -ECANCELED;
+
+ rgd = al->al_rgd;
+
if (rgrp_contains_block(rgd, ip->i_goal))
goal = ip->i_goal - rgd->rd_data0;
else
--
1.7.1.1
* [Cluster-devel] [PATCH 05/22] GFS2: fallocate support
2010-10-18 14:15 [Cluster-devel] GFS2: Pre-pull patch posting Steven Whitehouse
` (3 preceding siblings ...)
2010-10-18 14:15 ` [Cluster-devel] [PATCH 04/22] GFS2: Add a bug trap in allocation code Steven Whitehouse
@ 2010-10-18 14:15 ` Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 06/22] GFS2: Fix whitespace in previous patch Steven Whitehouse
` (16 subsequent siblings)
21 siblings, 0 replies; 38+ messages in thread
From: Steven Whitehouse @ 2010-10-18 14:15 UTC (permalink / raw)
To: cluster-devel.redhat.com
From: Benjamin Marzinski <bmarzins@redhat.com>
This patch adds support for fallocate to gfs2. Since gfs2 does not support
uninitialized data blocks, it must write out zeros to all the blocks. However,
since it does not need to lock any pages to read from, gfs2 can write out the
zero blocks much more efficiently. On a moderately full filesystem, fallocate
works around 5 times faster on average. The fallocate call also allows gfs2 to
add blocks to the file without changing the filesize, which will make it
possible for gfs2 to preallocate space for the rindex file, so that gfs2 can
grow a completely full filesystem.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index c92f36b..180ef8a 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -36,8 +36,8 @@
#include "glops.h"
-static void gfs2_page_add_databufs(struct gfs2_inode *ip, struct page *page,
- unsigned int from, unsigned int to)
+void gfs2_page_add_databufs(struct gfs2_inode *ip, struct page *page,
+ unsigned int from, unsigned int to)
{
struct buffer_head *head = page_buffers(page);
unsigned int bsize = head->b_size;
diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index c119717..578234b 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -571,6 +571,7 @@ struct gfs2_sbd {
struct list_head sd_rindex_mru_list;
struct gfs2_rgrpd *sd_rindex_forward;
unsigned int sd_rgrps;
+ unsigned int sd_max_rg_data;
/* Journal index stuff */
diff --git a/fs/gfs2/inode.h b/fs/gfs2/inode.h
index 15ff4df..6720d7d 100644
--- a/fs/gfs2/inode.h
+++ b/fs/gfs2/inode.h
@@ -19,6 +19,8 @@ extern int gfs2_releasepage(struct page *page, gfp_t gfp_mask);
extern int gfs2_internal_read(struct gfs2_inode *ip,
struct file_ra_state *ra_state,
char *buf, loff_t *pos, unsigned size);
+extern void gfs2_page_add_databufs(struct gfs2_inode *ip, struct page *page,
+ unsigned int from, unsigned int to);
extern void gfs2_set_aops(struct inode *inode);
static inline int gfs2_is_stuffed(const struct gfs2_inode *ip)
diff --git a/fs/gfs2/ops_inode.c b/fs/gfs2/ops_inode.c
index ee6ffd5..f6da0d7 100644
--- a/fs/gfs2/ops_inode.c
+++ b/fs/gfs2/ops_inode.c
@@ -18,6 +18,8 @@
#include <linux/gfs2_ondisk.h>
#include <linux/crc32.h>
#include <linux/fiemap.h>
+#include <linux/swap.h>
+#include <linux/falloc.h>
#include <asm/uaccess.h>
#include "gfs2.h"
@@ -1277,6 +1279,257 @@ static int gfs2_removexattr(struct dentry *dentry, const char *name)
return ret;
}
+static void empty_write_end(struct page *page, unsigned from,
+ unsigned to)
+{
+ struct gfs2_inode *ip = GFS2_I(page->mapping->host);
+
+ page_zero_new_buffers(page, from, to);
+ flush_dcache_page(page);
+ mark_page_accessed(page);
+
+ if (!gfs2_is_writeback(ip))
+ gfs2_page_add_databufs(ip, page, from, to);
+
+ block_commit_write(page, from, to);
+}
+
+
+static int write_empty_blocks(struct page *page, unsigned from, unsigned to)
+{
+ unsigned start, end, next;
+ struct buffer_head *bh, *head;
+ int error;
+
+ if (!page_has_buffers(page)) {
+ error = block_prepare_write(page, from, to, gfs2_block_map);
+ if (unlikely(error))
+ return error;
+
+ empty_write_end(page, from, to);
+ return 0;
+ }
+
+ bh = head = page_buffers(page);
+ next = end = 0;
+ while (next < from) {
+ next += bh->b_size;
+ bh = bh->b_this_page;
+ }
+ start = next;
+ do {
+ next += bh->b_size;
+ if (buffer_mapped(bh)) {
+ if (end) {
+ error = block_prepare_write(page, start, end,
+ gfs2_block_map);
+ if (unlikely(error))
+ return error;
+ empty_write_end(page, start, end);
+ end = 0;
+ }
+ start = next;
+ }
+ else
+ end = next;
+ bh = bh->b_this_page;
+ } while (next < to);
+
+ if (end) {
+ error = block_prepare_write(page, start, end, gfs2_block_map);
+ if (unlikely(error))
+ return error;
+ empty_write_end(page, start, end);
+ }
+
+ return 0;
+}
+
+static int fallocate_chunk(struct inode *inode, loff_t offset, loff_t len,
+ int mode)
+{
+ struct gfs2_inode *ip = GFS2_I(inode);
+ struct buffer_head *dibh;
+ int error;
+ u64 start = offset >> PAGE_CACHE_SHIFT;
+ unsigned int start_offset = offset & ~PAGE_CACHE_MASK;
+ u64 end = (offset + len - 1) >> PAGE_CACHE_SHIFT;
+ pgoff_t curr;
+ struct page *page;
+ unsigned int end_offset = (offset + len) & ~PAGE_CACHE_MASK;
+ unsigned int from, to;
+
+ if (!end_offset)
+ end_offset = PAGE_CACHE_SIZE;
+
+ error = gfs2_meta_inode_buffer(ip, &dibh);
+ if (unlikely(error))
+ goto out;
+
+ gfs2_trans_add_bh(ip->i_gl, dibh, 1);
+
+ if (gfs2_is_stuffed(ip)) {
+ error = gfs2_unstuff_dinode(ip, NULL);
+ if (unlikely(error))
+ goto out;
+ }
+
+ curr = start;
+ offset = start << PAGE_CACHE_SHIFT;
+ from = start_offset;
+ to = PAGE_CACHE_SIZE;
+ while (curr <= end) {
+ page = grab_cache_page_write_begin(inode->i_mapping, curr,
+ AOP_FLAG_NOFS);
+ if (unlikely(!page)) {
+ error = -ENOMEM;
+ goto out;
+ }
+
+ if (curr == end)
+ to = end_offset;
+ error = write_empty_blocks(page, from, to);
+ if (!error && offset + to > inode->i_size &&
+ !(mode & FALLOC_FL_KEEP_SIZE)) {
+ i_size_write(inode, offset + to);
+ }
+ unlock_page(page);
+ page_cache_release(page);
+ if (error)
+ goto out;
+ curr++;
+ offset += PAGE_CACHE_SIZE;
+ from = 0;
+ }
+
+ gfs2_dinode_out(ip, dibh->b_data);
+ mark_inode_dirty(inode);
+
+ brelse(dibh);
+
+out:
+ return error;
+}
+
+static void calc_max_reserv(struct gfs2_inode *ip, loff_t max, loff_t *len,
+ unsigned int *data_blocks, unsigned int *ind_blocks)
+{
+ const struct gfs2_sbd *sdp = GFS2_SB(&ip->i_inode);
+ unsigned int max_blocks = ip->i_alloc->al_rgd->rd_free_clone;
+ unsigned int tmp, max_data = max_blocks - 3 * (sdp->sd_max_height - 1);
+
+ for (tmp = max_data; tmp > sdp->sd_diptrs;) {
+ tmp = DIV_ROUND_UP(tmp, sdp->sd_inptrs);
+ max_data -= tmp;
+ }
+ /* This calculation isn't the exact reverse of gfs2_write_calc_reserve,
+ so it might end up with fewer data blocks */
+ if (max_data <= *data_blocks)
+ return;
+ *data_blocks = max_data;
+ *ind_blocks = max_blocks - max_data;
+ *len = ((loff_t)max_data - 3) << sdp->sd_sb.sb_bsize_shift;
+ if (*len > max) {
+ *len = max;
+ gfs2_write_calc_reserv(ip, max, data_blocks, ind_blocks);
+ }
+}
+
+static long gfs2_fallocate(struct inode *inode, int mode, loff_t offset,
+ loff_t len)
+{
+ struct gfs2_sbd *sdp = GFS2_SB(inode);
+ struct gfs2_inode *ip = GFS2_I(inode);
+ unsigned int data_blocks = 0, ind_blocks = 0, rblocks;
+ loff_t bytes, max_bytes;
+ struct gfs2_alloc *al;
+ int error;
+ loff_t next = (offset + len - 1) >> sdp->sd_sb.sb_bsize_shift;
+ next = (next + 1) << sdp->sd_sb.sb_bsize_shift;
+
+ offset = (offset >> sdp->sd_sb.sb_bsize_shift) <<
+ sdp->sd_sb.sb_bsize_shift;
+
+ len = next - offset;
+ bytes = sdp->sd_max_rg_data * sdp->sd_sb.sb_bsize / 2;
+ if (!bytes)
+ bytes = UINT_MAX;
+
+ gfs2_holder_init(ip->i_gl, LM_ST_EXCLUSIVE, 0, &ip->i_gh);
+ error = gfs2_glock_nq(&ip->i_gh);
+ if (unlikely(error))
+ goto out_uninit;
+
+ if (!gfs2_write_alloc_required(ip, offset, len))
+ goto out_unlock;
+
+ while (len > 0) {
+ if (len < bytes)
+ bytes = len;
+ al = gfs2_alloc_get(ip);
+ if (!al) {
+ error = -ENOMEM;
+ goto out_unlock;
+ }
+
+ error = gfs2_quota_lock_check(ip);
+ if (error)
+ goto out_alloc_put;
+
+retry:
+ gfs2_write_calc_reserv(ip, bytes, &data_blocks, &ind_blocks);
+
+ al->al_requested = data_blocks + ind_blocks;
+ error = gfs2_inplace_reserve(ip);
+ if (error) {
+ if (error == -ENOSPC && bytes > sdp->sd_sb.sb_bsize) {
+ bytes >>= 1;
+ goto retry;
+ }
+ goto out_qunlock;
+ }
+ max_bytes = bytes;
+ calc_max_reserv(ip, len, &max_bytes, &data_blocks, &ind_blocks);
+ al->al_requested = data_blocks + ind_blocks;
+
+ rblocks = RES_DINODE + ind_blocks + RES_STATFS + RES_QUOTA +
+ RES_RG_HDR + ip->i_alloc->al_rgd->rd_length;
+ if (gfs2_is_jdata(ip))
+ rblocks += data_blocks ? data_blocks : 1;
+
+ error = gfs2_trans_begin(sdp, rblocks,
+ PAGE_CACHE_SIZE/sdp->sd_sb.sb_bsize);
+ if (error)
+ goto out_trans_fail;
+
+ error = fallocate_chunk(inode, offset, max_bytes, mode);
+ gfs2_trans_end(sdp);
+
+ if (error)
+ goto out_trans_fail;
+
+ len -= max_bytes;
+ offset += max_bytes;
+ gfs2_inplace_release(ip);
+ gfs2_quota_unlock(ip);
+ gfs2_alloc_put(ip);
+ }
+ goto out_unlock;
+
+out_trans_fail:
+ gfs2_inplace_release(ip);
+out_qunlock:
+ gfs2_quota_unlock(ip);
+out_alloc_put:
+ gfs2_alloc_put(ip);
+out_unlock:
+ gfs2_glock_dq(&ip->i_gh);
+out_uninit:
+ gfs2_holder_uninit(&ip->i_gh);
+ return error;
+}
+
+
static int gfs2_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
u64 start, u64 len)
{
@@ -1327,6 +1580,7 @@ const struct inode_operations gfs2_file_iops = {
.getxattr = gfs2_getxattr,
.listxattr = gfs2_listxattr,
.removexattr = gfs2_removexattr,
+ .fallocate = gfs2_fallocate,
.fiemap = gfs2_fiemap,
};
diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c
index 66b6d4d..f9ddcf4 100644
--- a/fs/gfs2/rgrp.c
+++ b/fs/gfs2/rgrp.c
@@ -589,6 +589,8 @@ static int gfs2_ri_update(struct gfs2_inode *ip)
struct inode *inode = &ip->i_inode;
struct file_ra_state ra_state;
u64 rgrp_count = i_size_read(inode);
+ struct gfs2_rgrpd *rgd;
+ unsigned int max_data = 0;
int error;
do_div(rgrp_count, sizeof(struct gfs2_rindex));
@@ -603,6 +605,10 @@ static int gfs2_ri_update(struct gfs2_inode *ip)
}
}
+ list_for_each_entry(rgd, &sdp->sd_rindex_list, rd_list)
+ if (rgd->rd_data > max_data)
+ max_data = rgd->rd_data;
+ sdp->sd_max_rg_data = max_data;
sdp->sd_rindex_uptodate = 1;
return 0;
}
@@ -622,6 +628,8 @@ static int gfs2_ri_update_special(struct gfs2_inode *ip)
struct gfs2_sbd *sdp = GFS2_SB(&ip->i_inode);
struct inode *inode = &ip->i_inode;
struct file_ra_state ra_state;
+ struct gfs2_rgrpd *rgd;
+ unsigned int max_data = 0;
int error;
file_ra_state_init(&ra_state, inode->i_mapping);
@@ -636,6 +644,10 @@ static int gfs2_ri_update_special(struct gfs2_inode *ip)
return error;
}
}
+ list_for_each_entry(rgd, &sdp->sd_rindex_list, rd_list)
+ if (rgd->rd_data > max_data)
+ max_data = rgd->rd_data;
+ sdp->sd_max_rg_data = max_data;
sdp->sd_rindex_uptodate = 1;
return 0;
diff --git a/fs/gfs2/trans.h b/fs/gfs2/trans.h
index edf9d4b..b849eb7 100644
--- a/fs/gfs2/trans.h
+++ b/fs/gfs2/trans.h
@@ -20,6 +20,7 @@ struct gfs2_glock;
#define RES_JDATA 1
#define RES_DATA 1
#define RES_LEAF 1
+#define RES_RG_HDR 1
#define RES_RG_BIT 2
#define RES_EATTR 1
#define RES_STATFS 1
--
1.7.1.1
* [Cluster-devel] [PATCH 06/22] GFS2: Fix whitespace in previous patch
2010-10-18 14:15 [Cluster-devel] GFS2: Pre-pull patch posting Steven Whitehouse
` (4 preceding siblings ...)
2010-10-18 14:15 ` [Cluster-devel] [PATCH 05/22] GFS2: fallocate support Steven Whitehouse
@ 2010-10-18 14:15 ` Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 07/22] GFS2: Don't enforce min hold time when two demotes occur in rapid succession Steven Whitehouse
` (15 subsequent siblings)
21 siblings, 0 replies; 38+ messages in thread
From: Steven Whitehouse @ 2010-10-18 14:15 UTC (permalink / raw)
To: cluster-devel.redhat.com
Removes the offending space
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
diff --git a/fs/gfs2/ops_inode.c b/fs/gfs2/ops_inode.c
index f6da0d7..ce4f1df 100644
--- a/fs/gfs2/ops_inode.c
+++ b/fs/gfs2/ops_inode.c
@@ -1423,7 +1423,7 @@ static void calc_max_reserv(struct gfs2_inode *ip, loff_t max, loff_t *len,
max_data -= tmp;
}
/* This calculation isn't the exact reverse of gfs2_write_calc_reserve,
- so it might end up with fewer data blocks */
+ so it might end up with fewer data blocks */
if (max_data <= *data_blocks)
return;
*data_blocks = max_data;
--
1.7.1.1
* [Cluster-devel] [PATCH 07/22] GFS2: Don't enforce min hold time when two demotes occur in rapid succession
2010-10-18 14:15 [Cluster-devel] GFS2: Pre-pull patch posting Steven Whitehouse
` (5 preceding siblings ...)
2010-10-18 14:15 ` [Cluster-devel] [PATCH 06/22] GFS2: Fix whitespace in previous patch Steven Whitehouse
@ 2010-10-18 14:15 ` Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 08/22] GFS2: Update handling of DLM return codes to match reality Steven Whitehouse
` (14 subsequent siblings)
21 siblings, 0 replies; 38+ messages in thread
From: Steven Whitehouse @ 2010-10-18 14:15 UTC (permalink / raw)
To: cluster-devel.redhat.com
Due to the design of the VFS, it is quite usual for operations on GFS2
to consist of a lookup (requiring a shared lock) followed by an
operation requiring an exclusive lock. If a remote node has cached an
exclusive lock, then it will receive two demote events in rapid succession,
first to a shared lock and then to unlocked. The existing min hold time
code was triggering in this case, even if the node was otherwise idle,
since the state change time was being updated by the initial demote.
This patch introduces logic to skip the min hold timer in the case that
a "double demote" of this kind has occurred. The min hold timer will
still be used in all other cases.
A new glock flag is introduced which is used to keep track of whether
there have been any newly queued holders since the last glock state
change. The min hold time is only applied if the flag is set.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Tested-by: Abhijith Das <adas@redhat.com>
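
The delay decision described above can be modelled in userspace roughly
as follows. This is a sketch only: the struct and function names are
hypothetical, times are plain tick counts, and counter wraparound
(which the kernel handles with time_before()) is ignored.

```c
#include <stdbool.h>

/* Hypothetical model of the glock fields that the delay decision reads. */
struct glock_model {
	unsigned long tchange;   /* time of the last state change */
	unsigned long min_hold;  /* minimum hold time */
	bool queued;             /* a holder was queued since the last state change */
	bool reply_pending;      /* a lock manager reply is outstanding */
};

/* Returns how long a remote demote request should be delayed. */
static unsigned long demote_delay(const struct glock_model *gl,
				  unsigned long now)
{
	unsigned long holdtime = gl->tchange + gl->min_hold;
	unsigned long delay = 0;

	/* Double demote: nothing has used the lock since it last changed state. */
	if (!gl->queued)
		return 0;

	if (now < holdtime)
		delay = holdtime - now;
	if (gl->reply_pending)
		delay = gl->min_hold;
	return delay;
}
```

With the flag clear, the second demote of a pair is acted on at once;
with it set, the behaviour matches the previous code.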
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 9adf8f9..8e478e2 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -441,6 +441,8 @@ static void state_change(struct gfs2_glock *gl, unsigned int new_state)
else
gfs2_glock_put_nolock(gl);
}
+ if (held1 && held2 && list_empty(&gl->gl_holders))
+ clear_bit(GLF_QUEUED, &gl->gl_flags);
gl->gl_state = new_state;
gl->gl_tchange = jiffies;
@@ -1012,6 +1014,7 @@ fail:
if (unlikely((gh->gh_flags & LM_FLAG_PRIORITY) && !insert_pt))
insert_pt = &gh2->gh_list;
}
+ set_bit(GLF_QUEUED, &gl->gl_flags);
if (likely(insert_pt == NULL)) {
list_add_tail(&gh->gh_list, &gl->gl_holders);
if (unlikely(gh->gh_flags & LM_FLAG_PRIORITY))
@@ -1310,10 +1313,12 @@ void gfs2_glock_cb(struct gfs2_glock *gl, unsigned int state)
gfs2_glock_hold(gl);
holdtime = gl->gl_tchange + gl->gl_ops->go_min_hold_time;
- if (time_before(now, holdtime))
- delay = holdtime - now;
- if (test_bit(GLF_REPLY_PENDING, &gl->gl_flags))
- delay = gl->gl_ops->go_min_hold_time;
+ if (test_bit(GLF_QUEUED, &gl->gl_flags)) {
+ if (time_before(now, holdtime))
+ delay = holdtime - now;
+ if (test_bit(GLF_REPLY_PENDING, &gl->gl_flags))
+ delay = gl->gl_ops->go_min_hold_time;
+ }
spin_lock(&gl->gl_spin);
handle_callback(gl, state, delay);
@@ -1660,6 +1665,8 @@ static const char *gflags2str(char *buf, const unsigned long *gflags)
*p++ = 'I';
if (test_bit(GLF_FROZEN, gflags))
*p++ = 'F';
+ if (test_bit(GLF_QUEUED, gflags))
+ *p++ = 'q';
*p = 0;
return buf;
}
diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index 578234b..b12ca10 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -196,6 +196,7 @@ enum {
GLF_REPLY_PENDING = 9,
GLF_INITIAL = 10,
GLF_FROZEN = 11,
+ GLF_QUEUED = 12,
};
struct gfs2_glock {
diff --git a/fs/gfs2/trace_gfs2.h b/fs/gfs2/trace_gfs2.h
index 148d55c..cedb0bb 100644
--- a/fs/gfs2/trace_gfs2.h
+++ b/fs/gfs2/trace_gfs2.h
@@ -39,7 +39,8 @@
{(1UL << GLF_INVALIDATE_IN_PROGRESS), "i" }, \
{(1UL << GLF_REPLY_PENDING), "r" }, \
{(1UL << GLF_INITIAL), "I" }, \
- {(1UL << GLF_FROZEN), "F" })
+ {(1UL << GLF_FROZEN), "F" }, \
+ {(1UL << GLF_QUEUED), "q" })
#ifndef NUMPTY
#define NUMPTY
--
1.7.1.1
* [Cluster-devel] [PATCH 08/22] GFS2: Update handling of DLM return codes to match reality
2010-10-18 14:15 [Cluster-devel] GFS2: Pre-pull patch posting Steven Whitehouse
` (6 preceding siblings ...)
2010-10-18 14:15 ` [Cluster-devel] [PATCH 07/22] GFS2: Don't enforce min hold time when two demotes occur in rapid succession Steven Whitehouse
@ 2010-10-18 14:15 ` Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 09/22] GFS2: Use new workqueue scheme Steven Whitehouse
` (13 subsequent siblings)
21 siblings, 0 replies; 38+ messages in thread
From: Steven Whitehouse @ 2010-10-18 14:15 UTC (permalink / raw)
To: cluster-devel.redhat.com
GFS2's idea of which return codes it needs to handle was based
upon those listed in dlm.h. Those didn't cover all the possible
codes and listed some which never happen. This updates GFS2 to
handle all the codes which can actually be returned from the
DLM under various circumstances.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
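
As an illustrative summary of the handling after this change, the
sketch below classifies only the status codes visible in the hunk that
follows. The enum and function names are hypothetical, and the cases
that precede the hunk in gdlm_ast() are left out.

```c
#include <errno.h>

/* Hypothetical outcome labels; the real code sets LM_OUT_* bits instead. */
enum ast_outcome_model {
	AST_SUCCESS,    /* status 0 */
	AST_NO_ERROR,   /* try lock failed or deadlock detected: no error flag set */
	AST_ERROR,      /* request cancelled due to timeout: reported as an error */
	AST_UNLISTED    /* not one of the codes shown in the hunk */
};

static enum ast_outcome_model classify_dlm_status(int status)
{
	switch (status) {
	case 0:
		return AST_SUCCESS;
	case -EAGAIN:
	case -EDEADLK:
		return AST_NO_ERROR;
	case -ETIMEDOUT:
		return AST_ERROR;
	default:
		/* -EINVAL and -ENOMEM are no longer listed explicitly. */
		return AST_UNLISTED;
	}
}
```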
diff --git a/fs/gfs2/lock_dlm.c b/fs/gfs2/lock_dlm.c
index 0e0470e..1c09425 100644
--- a/fs/gfs2/lock_dlm.c
+++ b/fs/gfs2/lock_dlm.c
@@ -42,9 +42,9 @@ static void gdlm_ast(void *arg)
ret |= LM_OUT_CANCELED;
goto out;
case -EAGAIN: /* Try lock fails */
+ case -EDEADLK: /* Deadlock detected */
goto out;
- case -EINVAL: /* Invalid */
- case -ENOMEM: /* Out of memory */
+ case -ETIMEDOUT: /* Canceled due to timeout */
ret |= LM_OUT_ERROR;
goto out;
case 0: /* Success */
--
1.7.1.1
* [Cluster-devel] [PATCH 09/22] GFS2: Use new workqueue scheme
2010-10-18 14:15 [Cluster-devel] GFS2: Pre-pull patch posting Steven Whitehouse
` (7 preceding siblings ...)
2010-10-18 14:15 ` [Cluster-devel] [PATCH 08/22] GFS2: Update handling of DLM return codes to match reality Steven Whitehouse
@ 2010-10-18 14:15 ` Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 10/22] GFS2: Make . and .. qstrs constant Steven Whitehouse
` (12 subsequent siblings)
21 siblings, 0 replies; 38+ messages in thread
From: Steven Whitehouse @ 2010-10-18 14:15 UTC (permalink / raw)
To: cluster-devel.redhat.com
The recovery workqueue can be freezable since
we want it to finish what it is doing if the system is to
be frozen (although why you'd want to freeze a cluster node
is beyond me since it will result in it being ejected from
the cluster). It does still make sense for single node
GFS2 filesystems though.
The glock workqueue will benefit from being able to run more
work items concurrently. A test running postmark shows
improved performance and multi-threaded workloads are likely
to benefit even more. It needs to be high priority because
the latency directly affects the latency of filesystem glock
operations.
The delete workqueue is similar to the recovery workqueue in
that it must not get blocked by memory allocations, and may
run for a long time.
Potentially other GFS2 threads might also be converted to
workqueues, but I'll leave that for a later patch.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 8e478e2..c3f2a5c 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -1783,10 +1783,12 @@ int __init gfs2_glock_init(void)
}
#endif
- glock_workqueue = create_workqueue("glock_workqueue");
+ glock_workqueue = alloc_workqueue("glock_workqueue", WQ_RESCUER |
+ WQ_HIGHPRI | WQ_FREEZEABLE, 0);
if (IS_ERR(glock_workqueue))
return PTR_ERR(glock_workqueue);
- gfs2_delete_workqueue = create_workqueue("delete_workqueue");
+ gfs2_delete_workqueue = alloc_workqueue("delete_workqueue", WQ_RESCUER |
+ WQ_FREEZEABLE, 0);
if (IS_ERR(gfs2_delete_workqueue)) {
destroy_workqueue(glock_workqueue);
return PTR_ERR(gfs2_delete_workqueue);
diff --git a/fs/gfs2/main.c b/fs/gfs2/main.c
index b1e9630..1c8bbf2 100644
--- a/fs/gfs2/main.c
+++ b/fs/gfs2/main.c
@@ -140,7 +140,7 @@ static int __init init_gfs2_fs(void)
error = -ENOMEM;
gfs_recovery_wq = alloc_workqueue("gfs_recovery",
- WQ_NON_REENTRANT | WQ_RESCUER, 0);
+ WQ_RESCUER | WQ_FREEZEABLE, 0);
if (!gfs_recovery_wq)
goto fail_wq;
--
1.7.1.1
* [Cluster-devel] [PATCH 10/22] GFS2: Make . and .. qstrs constant
2010-10-18 14:15 [Cluster-devel] GFS2: Pre-pull patch posting Steven Whitehouse
` (8 preceding siblings ...)
2010-10-18 14:15 ` [Cluster-devel] [PATCH 09/22] GFS2: Use new workqueue scheme Steven Whitehouse
@ 2010-10-18 14:15 ` Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 11/22] GFS2: Remove ignore_local_fs mount argument Steven Whitehouse
` (11 subsequent siblings)
21 siblings, 0 replies; 38+ messages in thread
From: Steven Whitehouse @ 2010-10-18 14:15 UTC (permalink / raw)
To: cluster-devel.redhat.com
Rather than calculating the qstrs for . and .. each time
we need them, it's better to keep a constant version of
these and just refer to them when required.
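The idea can be sketched in plain userspace C as follows. The struct, the toy hash and every name ending in _sketch are stand-ins made up for this example (the kernel's struct qstr and the GFS2 directory hash are not reproduced here); the point is only that the two names are filled in once at start-up and shared afterwards.

```c
#include <assert.h>
#include <string.h>

/* Simplified stand-in for the kernel's struct qstr. */
struct qstr_sketch {
	unsigned int hash;
	unsigned int len;
	const char *name;
};

/* Toy hash used only in this sketch; GFS2 uses its own directory hash. */
static unsigned int toy_hash(const char *s, unsigned int len)
{
	unsigned int h = 0;
	unsigned int i;

	for (i = 0; i < len; i++)
		h = h * 31u + (unsigned char)s[i];
	return h;
}

/* Fill in name, length and hash for a string, as a str2qstr helper would. */
static void str2qstr_sketch(struct qstr_sketch *q, const char *s)
{
	q->name = s;
	q->len = (unsigned int)strlen(s);
	q->hash = toy_hash(s, q->len);
}

/* Shared instances: computed once, then only read. */
struct qstr_sketch qdot_sketch;
struct qstr_sketch qdotdot_sketch;

/* Call once at start-up, mirroring the init-time setup in the patch. */
void init_dot_qstrs(void)
{
	str2qstr_sketch(&qdot_sketch, ".");
	str2qstr_sketch(&qdotdot_sketch, "..");
}
```

Callers then pass &qdot_sketch or &qdotdot_sketch directly instead of building a temporary on the stack each time.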
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
diff --git a/fs/gfs2/dir.c b/fs/gfs2/dir.c
index c1042ae..5c356d0 100644
--- a/fs/gfs2/dir.c
+++ b/fs/gfs2/dir.c
@@ -79,6 +79,9 @@
#define gfs2_disk_hash2offset(h) (((u64)(h)) >> 1)
#define gfs2_dir_offset2hash(p) ((u32)(((u64)(p)) << 1))
+struct qstr gfs2_qdot __read_mostly;
+struct qstr gfs2_qdotdot __read_mostly;
+
typedef int (*leaf_call_t) (struct gfs2_inode *dip, u32 index, u32 len,
u64 leaf_no, void *data);
typedef int (*gfs2_dscan_t)(const struct gfs2_dirent *dent,
diff --git a/fs/gfs2/dir.h b/fs/gfs2/dir.h
index 4f91944..a98f644 100644
--- a/fs/gfs2/dir.h
+++ b/fs/gfs2/dir.h
@@ -17,23 +17,24 @@ struct inode;
struct gfs2_inode;
struct gfs2_inum;
-struct inode *gfs2_dir_search(struct inode *dir, const struct qstr *filename);
-int gfs2_dir_check(struct inode *dir, const struct qstr *filename,
- const struct gfs2_inode *ip);
-int gfs2_dir_add(struct inode *inode, const struct qstr *filename,
- const struct gfs2_inode *ip, unsigned int type);
-int gfs2_dir_del(struct gfs2_inode *dip, const struct qstr *filename);
-int gfs2_dir_read(struct inode *inode, u64 *offset, void *opaque,
- filldir_t filldir);
-int gfs2_dir_mvino(struct gfs2_inode *dip, const struct qstr *filename,
- const struct gfs2_inode *nip, unsigned int new_type);
+extern struct inode *gfs2_dir_search(struct inode *dir,
+ const struct qstr *filename);
+extern int gfs2_dir_check(struct inode *dir, const struct qstr *filename,
+ const struct gfs2_inode *ip);
+extern int gfs2_dir_add(struct inode *inode, const struct qstr *filename,
+ const struct gfs2_inode *ip, unsigned int type);
+extern int gfs2_dir_del(struct gfs2_inode *dip, const struct qstr *filename);
+extern int gfs2_dir_read(struct inode *inode, u64 *offset, void *opaque,
+ filldir_t filldir);
+extern int gfs2_dir_mvino(struct gfs2_inode *dip, const struct qstr *filename,
+ const struct gfs2_inode *nip, unsigned int new_type);
-int gfs2_dir_exhash_dealloc(struct gfs2_inode *dip);
+extern int gfs2_dir_exhash_dealloc(struct gfs2_inode *dip);
-int gfs2_diradd_alloc_required(struct inode *dir,
- const struct qstr *filename);
-int gfs2_dir_get_new_buffer(struct gfs2_inode *ip, u64 block,
- struct buffer_head **bhp);
+extern int gfs2_diradd_alloc_required(struct inode *dir,
+ const struct qstr *filename);
+extern int gfs2_dir_get_new_buffer(struct gfs2_inode *ip, u64 block,
+ struct buffer_head **bhp);
static inline u32 gfs2_disk_hash(const char *data, int len)
{
@@ -61,4 +62,7 @@ static inline void gfs2_qstr2dirent(const struct qstr *name, u16 reclen, struct
memcpy(dent + 1, name->name, name->len);
}
+extern struct qstr gfs2_qdot;
+extern struct qstr gfs2_qdotdot;
+
#endif /* __DIR_DOT_H__ */
diff --git a/fs/gfs2/export.c b/fs/gfs2/export.c
index dfe237a..06d5827 100644
--- a/fs/gfs2/export.c
+++ b/fs/gfs2/export.c
@@ -126,16 +126,9 @@ static int gfs2_get_name(struct dentry *parent, char *name,
static struct dentry *gfs2_get_parent(struct dentry *child)
{
- struct qstr dotdot;
struct dentry *dentry;
- /*
- * XXX(hch): it would be a good idea to keep this around as a
- * static variable.
- */
- gfs2_str2qstr(&dotdot, "..");
-
- dentry = d_obtain_alias(gfs2_lookupi(child->d_inode, &dotdot, 1));
+ dentry = d_obtain_alias(gfs2_lookupi(child->d_inode, &gfs2_qdotdot, 1));
if (!IS_ERR(dentry))
dentry->d_op = &gfs2_dops;
return dentry;
diff --git a/fs/gfs2/main.c b/fs/gfs2/main.c
index 1c8bbf2..d7eb1e2 100644
--- a/fs/gfs2/main.c
+++ b/fs/gfs2/main.c
@@ -24,6 +24,7 @@
#include "glock.h"
#include "quota.h"
#include "recovery.h"
+#include "dir.h"
static struct shrinker qd_shrinker = {
.shrink = gfs2_shrink_qd_memory,
@@ -78,6 +79,9 @@ static int __init init_gfs2_fs(void)
{
int error;
+ gfs2_str2qstr(&gfs2_qdot, ".");
+ gfs2_str2qstr(&gfs2_qdotdot, "..");
+
error = gfs2_sys_init();
if (error)
return error;
diff --git a/fs/gfs2/ops_inode.c b/fs/gfs2/ops_inode.c
index ce4f1df..98a94cf 100644
--- a/fs/gfs2/ops_inode.c
+++ b/fs/gfs2/ops_inode.c
@@ -471,18 +471,15 @@ static int gfs2_mkdir(struct inode *dir, struct dentry *dentry, int mode)
if (!gfs2_assert_withdraw(sdp, !error)) {
struct gfs2_dinode *di = (struct gfs2_dinode *)dibh->b_data;
struct gfs2_dirent *dent = (struct gfs2_dirent *)(di+1);
- struct qstr str;
- gfs2_str2qstr(&str, ".");
gfs2_trans_add_bh(ip->i_gl, dibh, 1);
- gfs2_qstr2dirent(&str, GFS2_DIRENT_SIZE(str.len), dent);
+ gfs2_qstr2dirent(&gfs2_qdot, GFS2_DIRENT_SIZE(gfs2_qdot.len), dent);
dent->de_inum = di->di_num; /* already GFS2 endian */
dent->de_type = cpu_to_be16(DT_DIR);
di->di_entries = cpu_to_be32(1);
- gfs2_str2qstr(&str, "..");
dent = (struct gfs2_dirent *)((char*)dent + GFS2_DIRENT_SIZE(1));
- gfs2_qstr2dirent(&str, dibh->b_size - GFS2_DIRENT_SIZE(1) - sizeof(struct gfs2_dinode), dent);
+ gfs2_qstr2dirent(&gfs2_qdotdot, dibh->b_size - GFS2_DIRENT_SIZE(1) - sizeof(struct gfs2_dinode), dent);
gfs2_inum_out(dip, dent);
dent->de_type = cpu_to_be16(DT_DIR);
@@ -523,7 +520,6 @@ static int gfs2_mkdir(struct inode *dir, struct dentry *dentry, int mode)
static int gfs2_rmdiri(struct gfs2_inode *dip, const struct qstr *name,
struct gfs2_inode *ip)
{
- struct qstr dotname;
int error;
if (ip->i_entries != 2) {
@@ -540,13 +536,11 @@ static int gfs2_rmdiri(struct gfs2_inode *dip, const struct qstr *name,
if (error)
return error;
- gfs2_str2qstr(&dotname, ".");
- error = gfs2_dir_del(ip, &dotname);
+ error = gfs2_dir_del(ip, &gfs2_qdot);
if (error)
return error;
- gfs2_str2qstr(&dotname, "..");
- error = gfs2_dir_del(ip, &dotname);
+ error = gfs2_dir_del(ip, &gfs2_qdotdot);
if (error)
return error;
@@ -695,11 +689,8 @@ static int gfs2_ok_to_move(struct gfs2_inode *this, struct gfs2_inode *to)
struct inode *dir = &to->i_inode;
struct super_block *sb = dir->i_sb;
struct inode *tmp;
- struct qstr dotdot;
int error = 0;
- gfs2_str2qstr(&dotdot, "..");
-
igrab(dir);
for (;;) {
@@ -712,7 +703,7 @@ static int gfs2_ok_to_move(struct gfs2_inode *this, struct gfs2_inode *to)
break;
}
- tmp = gfs2_lookupi(dir, &dotdot, 1);
+ tmp = gfs2_lookupi(dir, &gfs2_qdotdot, 1);
if (IS_ERR(tmp)) {
error = PTR_ERR(tmp);
break;
@@ -921,9 +912,6 @@ static int gfs2_rename(struct inode *odir, struct dentry *odentry,
}
if (dir_rename) {
- struct qstr name;
- gfs2_str2qstr(&name, "..");
-
error = gfs2_change_nlink(ndip, +1);
if (error)
goto out_end_trans;
@@ -931,7 +919,7 @@ static int gfs2_rename(struct inode *odir, struct dentry *odentry,
if (error)
goto out_end_trans;
- error = gfs2_dir_mvino(ip, &name, ndip, DT_DIR);
+ error = gfs2_dir_mvino(ip, &gfs2_qdotdot, ndip, DT_DIR);
if (error)
goto out_end_trans;
} else {
--
1.7.1.1
* [Cluster-devel] [PATCH 11/22] GFS2: Remove ignore_local_fs mount argument
2010-10-18 14:15 [Cluster-devel] GFS2: Pre-pull patch posting Steven Whitehouse
` (9 preceding siblings ...)
2010-10-18 14:15 ` [Cluster-devel] [PATCH 10/22] GFS2: Make . and .. qstrs constant Steven Whitehouse
@ 2010-10-18 14:15 ` Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 12/22] GFS2: Remove localcaching mount option Steven Whitehouse
` (10 subsequent siblings)
21 siblings, 0 replies; 38+ messages in thread
From: Steven Whitehouse @ 2010-10-18 14:15 UTC (permalink / raw)
To: cluster-devel.redhat.com
This has been a no-op for a very long time now. I'm pretty sure
nobody uses it, but just in case, we'll still accept it on the
command line and simply ignore it.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index b12ca10..c8a2db1 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -416,7 +416,6 @@ struct gfs2_args {
char ar_locktable[GFS2_LOCKNAME_LEN]; /* Name of the Lock Table */
char ar_hostdata[GFS2_LOCKNAME_LEN]; /* Host specific data */
unsigned int ar_spectator:1; /* Don't get a journal */
- unsigned int ar_ignore_local_fs:1; /* Ignore optimisations */
unsigned int ar_localflocks:1; /* Let the VFS do flock|fcntl */
unsigned int ar_localcaching:1; /* Local caching */
unsigned int ar_debug:1; /* Oops on errors */
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index e031fa4..06a4a7e 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -159,7 +159,7 @@ int gfs2_mount_args(struct gfs2_args *args, char *options)
args->ar_spectator = 1;
break;
case Opt_ignore_local_fs:
- args->ar_ignore_local_fs = 1;
+ /* Retained for backwards compat only */
break;
case Opt_localflocks:
args->ar_localflocks = 1;
@@ -1128,7 +1128,6 @@ static int gfs2_remount_fs(struct super_block *sb, int *flags, char *data)
/* Some flags must not be changed */
if (args_neq(&args, &sdp->sd_args, spectator) ||
- args_neq(&args, &sdp->sd_args, ignore_local_fs) ||
args_neq(&args, &sdp->sd_args, localflocks) ||
args_neq(&args, &sdp->sd_args, localcaching) ||
args_neq(&args, &sdp->sd_args, meta))
@@ -1233,8 +1232,6 @@ static int gfs2_show_options(struct seq_file *s, struct vfsmount *mnt)
seq_printf(s, ",hostdata=%s", args->ar_hostdata);
if (args->ar_spectator)
seq_printf(s, ",spectator");
- if (args->ar_ignore_local_fs)
- seq_printf(s, ",ignore_local_fs");
if (args->ar_localflocks)
seq_printf(s, ",localflocks");
if (args->ar_localcaching)
--
1.7.1.1
* [Cluster-devel] [PATCH 12/22] GFS2: Remove localcaching mount option
2010-10-18 14:15 [Cluster-devel] GFS2: Pre-pull patch posting Steven Whitehouse
` (10 preceding siblings ...)
2010-10-18 14:15 ` [Cluster-devel] [PATCH 11/22] GFS2: Remove ignore_local_fs mount argument Steven Whitehouse
@ 2010-10-18 14:15 ` Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 13/22] GFS2: Remove upgrade " Steven Whitehouse
` (9 subsequent siblings)
21 siblings, 0 replies; 38+ messages in thread
From: Steven Whitehouse @ 2010-10-18 14:15 UTC (permalink / raw)
To: cluster-devel.redhat.com
This option defaulted to on for lock_nolock mounts and off
otherwise. The only function was to avoid the revalidation of
dentries. In the cluster case, that is entirely pointless and
liable to cause coherency problems.
The patch changes the revalidation to depend upon whether the
fs is a local or cluster fs (i.e. it follows the existing default
behaviour). I very much doubt anybody ever used this option as
there is no reason to. Even so we will continue to accept it
on the mount command line, but ignore it.
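A minimal userspace sketch of the new decision, assuming (as the test in the diff below suggests) that a local lock protocol is recognised by having no lm_mount hook. The struct and all names here are hypothetical simplifications, not the real lock-ops table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cut-down lock operations table. */
struct lock_ops_sketch {
	int (*lm_mount)(void); /* NULL for a local (nolock-style) protocol */
};

/* Placeholder for a cluster protocol's mount hook. */
static int cluster_mount_stub(void)
{
	return 0;
}

static const struct lock_ops_sketch nolock_sketch = { .lm_mount = NULL };
static const struct lock_ops_sketch dlm_sketch = { .lm_mount = cluster_mount_stub };

/* Returns 1 when cached dentries can be trusted without revalidation. */
int can_skip_revalidation(const struct lock_ops_sketch *ops)
{
	return ops->lm_mount == NULL;
}
```

So the behaviour follows the lock protocol in use rather than a separately settable flag, which is what removes the chance of enabling it on a cluster mount.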
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
diff --git a/fs/gfs2/dentry.c b/fs/gfs2/dentry.c
index bb7907b..6798755 100644
--- a/fs/gfs2/dentry.c
+++ b/fs/gfs2/dentry.c
@@ -49,7 +49,7 @@ static int gfs2_drevalidate(struct dentry *dentry, struct nameidata *nd)
ip = GFS2_I(inode);
}
- if (sdp->sd_args.ar_localcaching)
+ if (sdp->sd_lockstruct.ls_ops->lm_mount == NULL)
goto valid;
had_lock = (gfs2_glock_is_locked_by_me(dip->i_gl) != NULL);
diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index c8a2db1..2990a0a 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -417,7 +417,6 @@ struct gfs2_args {
char ar_hostdata[GFS2_LOCKNAME_LEN]; /* Host specific data */
unsigned int ar_spectator:1; /* Don't get a journal */
unsigned int ar_localflocks:1; /* Let the VFS do flock|fcntl */
- unsigned int ar_localcaching:1; /* Local caching */
unsigned int ar_debug:1; /* Oops on errors */
unsigned int ar_upgrade:1; /* Upgrade ondisk format */
unsigned int ar_posix_acl:1; /* Enable posix acls */
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index 5b5c87d..558bba4 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -1022,7 +1022,6 @@ static int gfs2_lm_mount(struct gfs2_sbd *sdp, int silent)
if (!strcmp("lock_nolock", proto)) {
lm = &nolock_ops;
sdp->sd_args.ar_localflocks = 1;
- sdp->sd_args.ar_localcaching = 1;
#ifdef CONFIG_GFS2_FS_LOCKING_DLM
} else if (!strcmp("lock_dlm", proto)) {
lm = &gfs2_dlm_ops;
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index 06a4a7e..e78de8b 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -165,7 +165,7 @@ int gfs2_mount_args(struct gfs2_args *args, char *options)
args->ar_localflocks = 1;
break;
case Opt_localcaching:
- args->ar_localcaching = 1;
+ /* Retained for backwards compat only */
break;
case Opt_debug:
if (args->ar_errors == GFS2_ERRORS_PANIC) {
@@ -1129,7 +1129,6 @@ static int gfs2_remount_fs(struct super_block *sb, int *flags, char *data)
/* Some flags must not be changed */
if (args_neq(&args, &sdp->sd_args, spectator) ||
args_neq(&args, &sdp->sd_args, localflocks) ||
- args_neq(&args, &sdp->sd_args, localcaching) ||
args_neq(&args, &sdp->sd_args, meta))
return -EINVAL;
@@ -1234,8 +1233,6 @@ static int gfs2_show_options(struct seq_file *s, struct vfsmount *mnt)
seq_printf(s, ",spectator");
if (args->ar_localflocks)
seq_printf(s, ",localflocks");
- if (args->ar_localcaching)
- seq_printf(s, ",localcaching");
if (args->ar_debug)
seq_printf(s, ",debug");
if (args->ar_upgrade)
--
1.7.1.1
* [Cluster-devel] [PATCH 13/22] GFS2: Remove upgrade mount option
2010-10-18 14:15 [Cluster-devel] GFS2: Pre-pull patch posting Steven Whitehouse
` (11 preceding siblings ...)
2010-10-18 14:15 ` [Cluster-devel] [PATCH 12/22] GFS2: Remove localcaching mount option Steven Whitehouse
@ 2010-10-18 14:15 ` Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 14/22] GFS2: Fix journal check for spectator mounts Steven Whitehouse
` (8 subsequent siblings)
21 siblings, 0 replies; 38+ messages in thread
From: Steven Whitehouse @ 2010-10-18 14:15 UTC (permalink / raw)
To: cluster-devel.redhat.com
This option has never done anything useful. Also at the same time
this cleans up the sb checks which are done at mount time. The
upgrade option will be accepted, but ignored in future. Since it
didn't do anything, there didn't seem much point in retaining it.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index 2990a0a..6f6ff8a 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -418,7 +418,6 @@ struct gfs2_args {
unsigned int ar_spectator:1; /* Don't get a journal */
unsigned int ar_localflocks:1; /* Let the VFS do flock|fcntl */
unsigned int ar_debug:1; /* Oops on errors */
- unsigned int ar_upgrade:1; /* Upgrade ondisk format */
unsigned int ar_posix_acl:1; /* Enable posix acls */
unsigned int ar_quota:2; /* off/account/on */
unsigned int ar_suiddir:1; /* suiddir support */
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index 558bba4..5b11f37 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -38,14 +38,6 @@
#define DO 0
#define UNDO 1
-static const u32 gfs2_old_fs_formats[] = {
- 0
-};
-
-static const u32 gfs2_old_multihost_formats[] = {
- 0
-};
-
/**
* gfs2_tune_init - Fill a gfs2_tune structure with default values
* @gt: tune
@@ -135,8 +127,6 @@ static struct gfs2_sbd *init_sbd(struct super_block *sb)
static int gfs2_check_sb(struct gfs2_sbd *sdp, struct gfs2_sb_host *sb, int silent)
{
- unsigned int x;
-
if (sb->sb_magic != GFS2_MAGIC ||
sb->sb_type != GFS2_METATYPE_SB) {
if (!silent)
@@ -150,55 +140,9 @@ static int gfs2_check_sb(struct gfs2_sbd *sdp, struct gfs2_sb_host *sb, int sile
sb->sb_multihost_format == GFS2_FORMAT_MULTI)
return 0;
- if (sb->sb_fs_format != GFS2_FORMAT_FS) {
- for (x = 0; gfs2_old_fs_formats[x]; x++)
- if (gfs2_old_fs_formats[x] == sb->sb_fs_format)
- break;
-
- if (!gfs2_old_fs_formats[x]) {
- printk(KERN_WARNING
- "GFS2: code version (%u, %u) is incompatible "
- "with ondisk format (%u, %u)\n",
- GFS2_FORMAT_FS, GFS2_FORMAT_MULTI,
- sb->sb_fs_format, sb->sb_multihost_format);
- printk(KERN_WARNING
- "GFS2: I don't know how to upgrade this FS\n");
- return -EINVAL;
- }
- }
-
- if (sb->sb_multihost_format != GFS2_FORMAT_MULTI) {
- for (x = 0; gfs2_old_multihost_formats[x]; x++)
- if (gfs2_old_multihost_formats[x] ==
- sb->sb_multihost_format)
- break;
-
- if (!gfs2_old_multihost_formats[x]) {
- printk(KERN_WARNING
- "GFS2: code version (%u, %u) is incompatible "
- "with ondisk format (%u, %u)\n",
- GFS2_FORMAT_FS, GFS2_FORMAT_MULTI,
- sb->sb_fs_format, sb->sb_multihost_format);
- printk(KERN_WARNING
- "GFS2: I don't know how to upgrade this FS\n");
- return -EINVAL;
- }
- }
+ fs_warn(sdp, "Unknown on-disk format, unable to mount\n");
- if (!sdp->sd_args.ar_upgrade) {
- printk(KERN_WARNING
- "GFS2: code version (%u, %u) is incompatible "
- "with ondisk format (%u, %u)\n",
- GFS2_FORMAT_FS, GFS2_FORMAT_MULTI,
- sb->sb_fs_format, sb->sb_multihost_format);
- printk(KERN_INFO
- "GFS2: Use the \"upgrade\" mount option to upgrade "
- "the FS\n");
- printk(KERN_INFO "GFS2: See the manual for more details\n");
- return -EINVAL;
- }
-
- return 0;
+ return -EINVAL;
}
static void end_bio_io_page(struct bio *bio, int error)
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index e78de8b..d85e0b7 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -179,7 +179,7 @@ int gfs2_mount_args(struct gfs2_args *args, char *options)
args->ar_debug = 0;
break;
case Opt_upgrade:
- args->ar_upgrade = 1;
+ /* Retained for backwards compat only */
break;
case Opt_acl:
args->ar_posix_acl = 1;
@@ -1235,8 +1235,6 @@ static int gfs2_show_options(struct seq_file *s, struct vfsmount *mnt)
seq_printf(s, ",localflocks");
if (args->ar_debug)
seq_printf(s, ",debug");
- if (args->ar_upgrade)
- seq_printf(s, ",upgrade");
if (args->ar_posix_acl)
seq_printf(s, ",acl");
if (args->ar_quota != GFS2_QUOTA_DEFAULT) {
--
1.7.1.1
* [Cluster-devel] [PATCH 14/22] GFS2: Fix journal check for spectator mounts
2010-10-18 14:15 [Cluster-devel] GFS2: Pre-pull patch posting Steven Whitehouse
` (12 preceding siblings ...)
2010-10-18 14:15 ` [Cluster-devel] [PATCH 13/22] GFS2: Remove upgrade " Steven Whitehouse
@ 2010-10-18 14:15 ` Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 15/22] GFS2: reserve more blocks for transactions Steven Whitehouse
` (7 subsequent siblings)
21 siblings, 0 replies; 38+ messages in thread
From: Steven Whitehouse @ 2010-10-18 14:15 UTC (permalink / raw)
To: cluster-devel.redhat.com
When checking journals for spectator mounts, we cannot rely on the
journal being locked, whatever its jid might be. This patch
ensures that we always get the journal locks when checking
journals for a spectator mount.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
diff --git a/fs/gfs2/recovery.c b/fs/gfs2/recovery.c
index f7f89a9..666548e 100644
--- a/fs/gfs2/recovery.c
+++ b/fs/gfs2/recovery.c
@@ -456,7 +456,8 @@ void gfs2_recover_func(struct work_struct *work)
unsigned int pass;
int error;
- if (jd->jd_jid != sdp->sd_lockstruct.ls_jid) {
+ if (sdp->sd_args.ar_spectator ||
+ (jd->jd_jid != sdp->sd_lockstruct.ls_jid)) {
fs_info(sdp, "jid=%u: Trying to acquire journal lock...\n",
jd->jd_jid);
--
1.7.1.1
* [Cluster-devel] [PATCH 15/22] GFS2: reserve more blocks for transactions
2010-10-18 14:15 [Cluster-devel] GFS2: Pre-pull patch posting Steven Whitehouse
` (13 preceding siblings ...)
2010-10-18 14:15 ` [Cluster-devel] [PATCH 14/22] GFS2: Fix journal check for spectator mounts Steven Whitehouse
@ 2010-10-18 14:15 ` Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 16/22] GFS2: Fix compiler warning from previous patch Steven Whitehouse
` (6 subsequent siblings)
21 siblings, 0 replies; 38+ messages in thread
From: Steven Whitehouse @ 2010-10-18 14:15 UTC (permalink / raw)
To: cluster-devel.redhat.com
From: Benjamin Marzinski <bmarzins@redhat.com>
Some of the functions in GFS2 were not reserving space in the transaction for
the resource group header and the resource group's bitblocks that get added
when you do allocation. GFS2 now makes sure to reserve space for the
resource group header and either all the bitblocks in the resource group, or
one for each block that it may allocate, whichever is smaller, using the new
gfs2_rg_blocks() inline function.
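To make the arithmetic concrete, here is a small userspace copy of the calculation with a few worked values. The two structs are cut down to the fields the helper reads, and rg_blocks_for() is a convenience wrapper invented for this example.

```c
#include <assert.h>

/* Cut-down stand-ins holding only the fields the helper reads. */
struct rgd_sketch {
	unsigned int rd_length;
};

struct alloc_sketch {
	unsigned int al_requested;
	const struct rgd_sketch *al_rgd;
};

/* Same expression as the new inline function in trans.h. */
static unsigned int rg_blocks_sketch(const struct alloc_sketch *al)
{
	return (al->al_requested < al->al_rgd->rd_length) ?
		al->al_requested + 1 : al->al_rgd->rd_length;
}

/* Convenience wrapper for trying out values. */
unsigned int rg_blocks_for(unsigned int requested, unsigned int rd_length)
{
	struct rgd_sketch rgd = { .rd_length = rd_length };
	struct alloc_sketch al = { .al_requested = requested, .al_rgd = &rgd };

	return rg_blocks_sketch(&al);
}
```

With rd_length = 5, a request for 2 blocks reserves 3 (one per block plus the header), while requests for 5 or 9 blocks are both capped at 5.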
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index 180ef8a..1bf1788 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -663,6 +663,8 @@ static int gfs2_write_begin(struct file *file, struct address_space *mapping,
rblocks += RES_STATFS + RES_QUOTA;
if (&ip->i_inode == sdp->sd_rindex)
rblocks += 2 * RES_STATFS;
+ if (alloc_required)
+ rblocks += gfs2_rg_blocks(al);
error = gfs2_trans_begin(sdp, rblocks,
PAGE_CACHE_SIZE/sdp->sd_sb.sb_bsize);
diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index 04513e9..5476c06 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -1151,7 +1151,7 @@ static int do_grow(struct inode *inode, u64 size)
goto do_grow_qunlock;
}
- error = gfs2_trans_begin(sdp, RES_DINODE + 1, 0);
+ error = gfs2_trans_begin(sdp, RES_DINODE + RES_STATFS + RES_RG_BIT, 0);
if (error)
goto do_grow_release;
diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c
index daadcd2..237ee6a 100644
--- a/fs/gfs2/file.c
+++ b/fs/gfs2/file.c
@@ -382,8 +382,10 @@ static int gfs2_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
rblocks = RES_DINODE + ind_blocks;
if (gfs2_is_jdata(ip))
rblocks += data_blocks ? data_blocks : 1;
- if (ind_blocks || data_blocks)
+ if (ind_blocks || data_blocks) {
rblocks += RES_STATFS + RES_QUOTA;
+ rblocks += gfs2_rg_blocks(al);
+ }
ret = gfs2_trans_begin(sdp, rblocks, 0);
if (ret)
goto out_trans_fail;
diff --git a/fs/gfs2/ops_inode.c b/fs/gfs2/ops_inode.c
index 98a94cf..fba0017 100644
--- a/fs/gfs2/ops_inode.c
+++ b/fs/gfs2/ops_inode.c
@@ -219,7 +219,7 @@ static int gfs2_link(struct dentry *old_dentry, struct inode *dir,
goto out_gunlock_q;
error = gfs2_trans_begin(sdp, sdp->sd_max_dirres +
- al->al_rgd->rd_length +
+ gfs2_rg_blocks(al) +
2 * RES_DINODE + RES_STATFS +
RES_QUOTA, 0);
if (error)
@@ -884,7 +884,7 @@ static int gfs2_rename(struct inode *odir, struct dentry *odentry,
goto out_gunlock_q;
error = gfs2_trans_begin(sdp, sdp->sd_max_dirres +
- al->al_rgd->rd_length +
+ gfs2_rg_blocks(al) +
4 * RES_DINODE + 4 * RES_LEAF +
RES_STATFS + RES_QUOTA + 4, 0);
if (error)
@@ -1481,7 +1481,7 @@ retry:
al->al_requested = data_blocks + ind_blocks;
rblocks = RES_DINODE + ind_blocks + RES_STATFS + RES_QUOTA +
- RES_RG_HDR + ip->i_alloc->al_rgd->rd_length;
+ RES_RG_HDR + gfs2_rg_blocks(al);
if (gfs2_is_jdata(ip))
rblocks += data_blocks ? data_blocks : 1;
diff --git a/fs/gfs2/quota.c b/fs/gfs2/quota.c
index 9bc6dd9..58a9b99 100644
--- a/fs/gfs2/quota.c
+++ b/fs/gfs2/quota.c
@@ -815,7 +815,7 @@ static int do_sync(unsigned int num_qd, struct gfs2_quota_data **qda)
goto out_alloc;
if (nalloc)
- blocks += al->al_rgd->rd_length + nalloc * ind_blocks + RES_STATFS;
+ blocks += gfs2_rg_blocks(al) + nalloc * ind_blocks + RES_STATFS;
error = gfs2_trans_begin(sdp, blocks, 0);
if (error)
@@ -1586,6 +1586,7 @@ static int gfs2_set_dqblk(struct super_block *sb, int type, qid_t id,
error = gfs2_inplace_reserve(ip);
if (error)
goto out_alloc;
+ blocks += gfs2_rg_blocks(al);
}
error = gfs2_trans_begin(sdp, blocks + RES_DINODE + 1, 0);
diff --git a/fs/gfs2/trans.h b/fs/gfs2/trans.h
index b849eb7..fb56b78 100644
--- a/fs/gfs2/trans.h
+++ b/fs/gfs2/trans.h
@@ -26,6 +26,14 @@ struct gfs2_glock;
#define RES_STATFS 1
#define RES_QUOTA 2
+/* reserve either the number of blocks to be allocated plus the rg header
+ * block, or all of the blocks in the rg, whichever is smaller */
+static inline unsigned int gfs2_rg_blocks(const struct gfs2_alloc *al)
+{
+ return (al->al_requested < al->al_rgd->rd_length)?
+ al->al_requested + 1 : al->al_rgd->rd_length;
+}
+
int gfs2_trans_begin(struct gfs2_sbd *sdp, unsigned int blocks,
unsigned int revokes);
diff --git a/fs/gfs2/xattr.c b/fs/gfs2/xattr.c
index 776af6e..30b58f0 100644
--- a/fs/gfs2/xattr.c
+++ b/fs/gfs2/xattr.c
@@ -734,7 +734,7 @@ static int ea_alloc_skeleton(struct gfs2_inode *ip, struct gfs2_ea_request *er,
goto out_gunlock_q;
error = gfs2_trans_begin(GFS2_SB(&ip->i_inode),
- blks + al->al_rgd->rd_length +
+ blks + gfs2_rg_blocks(al) +
RES_DINODE + RES_STATFS + RES_QUOTA, 0);
if (error)
goto out_ipres;
--
1.7.1.1
* [Cluster-devel] [PATCH 16/22] GFS2: Fix compiler warning from previous patch
2010-10-18 14:15 [Cluster-devel] GFS2: Pre-pull patch posting Steven Whitehouse
` (14 preceding siblings ...)
2010-10-18 14:15 ` [Cluster-devel] [PATCH 15/22] GFS2: reserve more blocks for transactions Steven Whitehouse
@ 2010-10-18 14:15 ` Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 17/22] GFS2: Fix spectator umount issue Steven Whitehouse
` (5 subsequent siblings)
21 siblings, 0 replies; 38+ messages in thread
From: Steven Whitehouse @ 2010-10-18 14:15 UTC (permalink / raw)
To: cluster-devel.redhat.com
This shouldn't really be required, but gcc can't tell that
"al" is only accessed when initialised.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index 1bf1788..6b24afb 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -615,7 +615,7 @@ static int gfs2_write_begin(struct file *file, struct address_space *mapping,
unsigned int data_blocks = 0, ind_blocks = 0, rblocks;
int alloc_required;
int error = 0;
- struct gfs2_alloc *al;
+ struct gfs2_alloc *al = NULL;
pgoff_t index = pos >> PAGE_CACHE_SHIFT;
unsigned from = pos & (PAGE_CACHE_SIZE - 1);
unsigned to = from + len;
--
1.7.1.1
* [Cluster-devel] [PATCH 17/22] GFS2: Fix spectator umount issue
2010-10-18 14:15 [Cluster-devel] GFS2: Pre-pull patch posting Steven Whitehouse
` (15 preceding siblings ...)
2010-10-18 14:15 ` [Cluster-devel] [PATCH 16/22] GFS2: Fix compiler warning from previous patch Steven Whitehouse
@ 2010-10-18 14:15 ` Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 18/22] GFS2: Add "norecovery" mount option as a synonym for "spectator" Steven Whitehouse
` (4 subsequent siblings)
21 siblings, 0 replies; 38+ messages in thread
From: Steven Whitehouse @ 2010-10-18 14:15 UTC (permalink / raw)
To: cluster-devel.redhat.com
The tests further down the recovery function relating to
unlocking the journal need to be updated to match the
initial test. Also, a test in the umount code which was
surplus to requirements has been removed. Umounting
spectator mounts now works correctly, as expected.
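The pattern used here, recording at acquisition time whether the lock was taken and releasing based on that record instead of repeating the condition, can be sketched in userspace like this. The lock is faked with a counter and all names are hypothetical.

```c
#include <assert.h>

/* Counts outstanding "journal locks" in this sketch. */
static int locks_held;

static void lock_journal(void)
{
	locks_held++;
}

static void unlock_journal(void)
{
	locks_held--;
}

/*
 * Take the lock under the same condition as the patched recovery code,
 * remember that we did so, and release on that basis.
 * Returns the number of locks still held; 0 means balanced.
 */
int recover_sketch(int spectator, int jid, int my_jid)
{
	int jlocked = 0;

	if (spectator || jid != my_jid) {
		lock_journal();
		jlocked = 1;
	}

	/* ... recovery work would happen here ... */

	if (jlocked)
		unlock_journal();

	return locks_held;
}
```

Because the release depends on jlocked, the acquire condition can change later without the unlock paths getting out of step with it.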
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index c3f2a5c..8777885 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -1517,7 +1517,7 @@ static void clear_glock(struct gfs2_glock *gl)
spin_unlock(&lru_lock);
spin_lock(&gl->gl_spin);
- if (find_first_holder(gl) == NULL && gl->gl_state != LM_ST_UNLOCKED)
+ if (gl->gl_state != LM_ST_UNLOCKED)
handle_callback(gl, LM_ST_UNLOCKED, 0);
spin_unlock(&gl->gl_spin);
gfs2_glock_hold(gl);
diff --git a/fs/gfs2/recovery.c b/fs/gfs2/recovery.c
index 666548e..f2a02ed 100644
--- a/fs/gfs2/recovery.c
+++ b/fs/gfs2/recovery.c
@@ -455,12 +455,13 @@ void gfs2_recover_func(struct work_struct *work)
int ro = 0;
unsigned int pass;
int error;
+ int jlocked = 0;
if (sdp->sd_args.ar_spectator ||
(jd->jd_jid != sdp->sd_lockstruct.ls_jid)) {
fs_info(sdp, "jid=%u: Trying to acquire journal lock...\n",
jd->jd_jid);
-
+ jlocked = 1;
/* Acquire the journal lock so we can do recovery */
error = gfs2_glock_nq_num(sdp, jd->jd_jid, &gfs2_journal_glops,
@@ -555,13 +556,12 @@ void gfs2_recover_func(struct work_struct *work)
jd->jd_jid, t);
}
- if (jd->jd_jid != sdp->sd_lockstruct.ls_jid)
- gfs2_glock_dq_uninit(&ji_gh);
-
gfs2_recovery_done(sdp, jd->jd_jid, LM_RD_SUCCESS);
- if (jd->jd_jid != sdp->sd_lockstruct.ls_jid)
+ if (jlocked) {
+ gfs2_glock_dq_uninit(&ji_gh);
gfs2_glock_dq_uninit(&j_gh);
+ }
fs_info(sdp, "jid=%u: Done\n", jd->jd_jid);
goto done;
@@ -569,7 +569,7 @@ void gfs2_recover_func(struct work_struct *work)
fail_gunlock_tr:
gfs2_glock_dq_uninit(&t_gh);
fail_gunlock_ji:
- if (jd->jd_jid != sdp->sd_lockstruct.ls_jid) {
+ if (jlocked) {
gfs2_glock_dq_uninit(&ji_gh);
fail_gunlock_j:
gfs2_glock_dq_uninit(&j_gh);
--
1.7.1.1
* [Cluster-devel] [PATCH 18/22] GFS2: Add "norecovery" mount option as a synonym for "spectator"
2010-10-18 14:15 [Cluster-devel] GFS2: Pre-pull patch posting Steven Whitehouse
` (16 preceding siblings ...)
2010-10-18 14:15 ` [Cluster-devel] [PATCH 17/22] GFS2: Fix spectator umount issue Steven Whitehouse
@ 2010-10-18 14:15 ` Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 19/22] GFS2: Improve journal allocation via sysfs Steven Whitehouse
` (3 subsequent siblings)
21 siblings, 0 replies; 38+ messages in thread
From: Steven Whitehouse @ 2010-10-18 14:15 UTC (permalink / raw)
To: cluster-devel.redhat.com
XFS supports the "norecovery" mount option which is basically the
same as the GFS2 spectator mode. This adds support for "norecovery"
as a synonym for spectator mode, which is hopefully a more obvious
description of what it actually does.
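A userspace sketch of how two option strings can share one token. It uses a plain strcmp() loop in place of the kernel's match_token(), and the enum and table below are made up for the example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical option tokens for this sketch. */
enum opt_sketch { OPT_SPECTATOR, OPT_LOCALFLOCKS, OPT_UNKNOWN };

struct token_sketch {
	enum opt_sketch token;
	const char *pattern;
};

/* Two patterns map to the same token, so they behave identically. */
static const struct token_sketch tokens_sketch[] = {
	{ OPT_SPECTATOR,   "spectator"   },
	{ OPT_SPECTATOR,   "norecovery"  },
	{ OPT_LOCALFLOCKS, "localflocks" },
};

/* Look up an option string; exact matches only in this sketch. */
enum opt_sketch match_option(const char *opt)
{
	size_t i;

	for (i = 0; i < sizeof(tokens_sketch) / sizeof(tokens_sketch[0]); i++)
		if (strcmp(tokens_sketch[i].pattern, opt) == 0)
			return tokens_sketch[i].token;
	return OPT_UNKNOWN;
}
```

Since the synonym is just another table row, no new case is needed in the option-parsing switch.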
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index d85e0b7..047d117 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -85,6 +85,7 @@ static const match_table_t tokens = {
{Opt_locktable, "locktable=%s"},
{Opt_hostdata, "hostdata=%s"},
{Opt_spectator, "spectator"},
+ {Opt_spectator, "norecovery"},
{Opt_ignore_local_fs, "ignore_local_fs"},
{Opt_localflocks, "localflocks"},
{Opt_localcaching, "localcaching"},
--
1.7.1.1
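The synonym works because two table rows carry the same token, so nothing that later switches on the token needs to change. Below is a minimal user-space sketch of that idea. It does not use the kernel's match_table_t or match_token(); the names opt_token, opt_entry, opt_table and lookup_option are hypothetical and exist only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical option ids; the real GFS2 enum has many more entries. */
enum opt_token { OPT_SPECTATOR, OPT_LOCALFLOCKS, OPT_ERR };

struct opt_entry {
	enum opt_token token;
	const char *name;
};

/* Two spellings share one token, so no new case is needed elsewhere. */
static const struct opt_entry opt_table[] = {
	{ OPT_SPECTATOR,   "spectator"   },
	{ OPT_SPECTATOR,   "norecovery"  },
	{ OPT_LOCALFLOCKS, "localflocks" },
};

/* Return the token for a single option word, or OPT_ERR if unknown. */
static enum opt_token lookup_option(const char *word)
{
	size_t i;

	for (i = 0; i < sizeof(opt_table) / sizeof(opt_table[0]); i++) {
		if (strcmp(opt_table[i].name, word) == 0)
			return opt_table[i].token;
	}
	return OPT_ERR;
}
```

With this table, lookup_option("norecovery") and lookup_option("spectator") return the same token, which is the whole effect of the one-line change above.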
* [Cluster-devel] [PATCH 19/22] GFS2: Improve journal allocation via sysfs
2010-10-18 14:15 [Cluster-devel] GFS2: Pre-pull patch posting Steven Whitehouse
` (17 preceding siblings ...)
2010-10-18 14:15 ` [Cluster-devel] [PATCH 18/22] GFS2: Add "norecovery" mount option as a synonym for "spectator" Steven Whitehouse
@ 2010-10-18 14:15 ` Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 20/22] GFS2 fatal: filesystem consistency error on rename Steven Whitehouse
` (2 subsequent siblings)
21 siblings, 0 replies; 38+ messages in thread
From: Steven Whitehouse @ 2010-10-18 14:15 UTC (permalink / raw)
To: cluster-devel.redhat.com
Recently a feature was added to GFS2 to allow journal id allocation
via sysfs. This patch builds upon that so that a negative journal id
will be treated as an error code to be passed back as the return code
from mount. This allows termination of the mount process if there is
a failure.
Also, the process has been updated so that the kernel will wait
for a journal id, even in the "spectator" case. This is required
in order to avoid mounting a filesystem in case there is an error
while joining the cluster. In the spectator case, 0 is written into
the file to indicate that all is well, and that mount should continue.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index 6f6ff8a..764fbb4 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -494,7 +494,7 @@ struct gfs2_sb_host {
*/
struct lm_lockstruct {
- unsigned int ls_jid;
+ int ls_jid;
unsigned int ls_first;
unsigned int ls_first_done;
unsigned int ls_nodir;
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index 5b11f37..aeafc23 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -1056,8 +1056,6 @@ static int gfs2_journalid_wait(void *word)
static int wait_on_journal(struct gfs2_sbd *sdp)
{
- if (sdp->sd_args.ar_spectator)
- return 0;
if (sdp->sd_lockstruct.ls_ops->lm_mount == NULL)
return 0;
@@ -1160,6 +1158,20 @@ static int fill_super(struct super_block *sb, struct gfs2_args *args, int silent
if (error)
goto fail_sb;
+ /*
+ * If user space has failed to join the cluster or some similar
+ * failure has occurred, then the journal id will contain a
+ * negative (error) number. This will then be returned to the
+ * caller (of the mount syscall). We do this even for spectator
+ * mounts (which just write a jid of 0 to indicate "ok" even though
+ * the jid is unused in the spectator case)
+ */
+ if (sdp->sd_lockstruct.ls_jid < 0) {
+ error = sdp->sd_lockstruct.ls_jid;
+ sdp->sd_lockstruct.ls_jid = 0;
+ goto fail_sb;
+ }
+
error = init_inodes(sdp, DO);
if (error)
goto fail_sb;
diff --git a/fs/gfs2/sys.c b/fs/gfs2/sys.c
index ccacffd..64082a5 100644
--- a/fs/gfs2/sys.c
+++ b/fs/gfs2/sys.c
@@ -399,31 +399,32 @@ static ssize_t recover_status_show(struct gfs2_sbd *sdp, char *buf)
static ssize_t jid_show(struct gfs2_sbd *sdp, char *buf)
{
- return sprintf(buf, "%u\n", sdp->sd_lockstruct.ls_jid);
+ return sprintf(buf, "%d\n", sdp->sd_lockstruct.ls_jid);
}
static ssize_t jid_store(struct gfs2_sbd *sdp, const char *buf, size_t len)
{
- unsigned jid;
+ int jid;
int rv;
- rv = sscanf(buf, "%u", &jid);
+ rv = sscanf(buf, "%d", &jid);
if (rv != 1)
return -EINVAL;
spin_lock(&sdp->sd_jindex_spin);
rv = -EINVAL;
- if (sdp->sd_args.ar_spectator)
- goto out;
if (sdp->sd_lockstruct.ls_ops->lm_mount == NULL)
goto out;
rv = -EBUSY;
- if (test_and_clear_bit(SDF_NOJOURNALID, &sdp->sd_flags) == 0)
+ if (test_bit(SDF_NOJOURNALID, &sdp->sd_flags) == 0)
goto out;
+ rv = 0;
+ if (sdp->sd_args.ar_spectator && jid > 0)
+ rv = jid = -EINVAL;
sdp->sd_lockstruct.ls_jid = jid;
+ clear_bit(SDF_NOJOURNALID, &sdp->sd_flags);
smp_mb__after_clear_bit();
wake_up_bit(&sdp->sd_flags, SDF_NOJOURNALID);
- rv = 0;
out:
spin_unlock(&sdp->sd_jindex_spin);
return rv ? rv : len;
@@ -617,7 +618,7 @@ static int gfs2_uevent(struct kset *kset, struct kobject *kobj,
add_uevent_var(env, "LOCKTABLE=%s", sdp->sd_table_name);
add_uevent_var(env, "LOCKPROTO=%s", sdp->sd_proto_name);
if (!test_bit(SDF_NOJOURNALID, &sdp->sd_flags))
- add_uevent_var(env, "JOURNALID=%u", sdp->sd_lockstruct.ls_jid);
+ add_uevent_var(env, "JOURNALID=%d", sdp->sd_lockstruct.ls_jid);
if (gfs2_uuid_valid(uuid))
add_uevent_var(env, "UUID=%pUB", uuid);
return 0;
--
1.7.1.1
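The handshake in this patch can be modelled outside the kernel. The sketch below is a simplified user-space model, not the kernel code: struct mount_state, store_jid() and finish_mount() are hypothetical names, the spinlock and wait-bit machinery are left out, and the "waiting" field merely stands in for SDF_NOJOURNALID.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for the relevant parts of struct gfs2_sbd. */
struct mount_state {
	int spectator;	/* non-zero for a spectator mount */
	int waiting;	/* models SDF_NOJOURNALID being set */
	int jid;	/* signed, so it can carry a negative errno */
};

/* Model of the sysfs write: returns 0 on success or a negative errno. */
static int store_jid(struct mount_state *ms, const char *buf)
{
	int jid;
	int rv = 0;

	if (sscanf(buf, "%d", &jid) != 1)
		return -EINVAL;
	if (!ms->waiting)
		return -EBUSY;		/* a jid has already been supplied */
	if (ms->spectator && jid > 0)
		rv = jid = -EINVAL;	/* spectators write 0 or an error */
	ms->jid = jid;
	ms->waiting = 0;		/* the mounter is released either way */
	return rv;
}

/* Model of the new check in fill_super(): a negative jid fails the mount. */
static int finish_mount(struct mount_state *ms)
{
	if (ms->jid < 0) {
		int error = ms->jid;

		ms->jid = 0;
		return error;
	}
	return 0;
}
```

In this model, writing "-5" is accepted by store_jid() and then makes finish_mount() return -5, while a spectator that writes a positive id gets -EINVAL from both calls.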
* [Cluster-devel] [PATCH 20/22] GFS2 fatal: filesystem consistency error on rename
2010-10-18 14:15 [Cluster-devel] GFS2: Pre-pull patch posting Steven Whitehouse
` (18 preceding siblings ...)
2010-10-18 14:15 ` [Cluster-devel] [PATCH 19/22] GFS2: Improve journal allocation via sysfs Steven Whitehouse
@ 2010-10-18 14:15 ` Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 21/22] GFS2: Fix type mapping for demote_rq interface Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 22/22] GFS2: fixed typo Steven Whitehouse
21 siblings, 0 replies; 38+ messages in thread
From: Steven Whitehouse @ 2010-10-18 14:15 UTC (permalink / raw)
To: cluster-devel.redhat.com
From: Bob Peterson <rpeterso@redhat.com>
This patch fixes a GFS2 problem whereby the first rename after a
mount can cause a file system consistency error to be flagged
improperly, which in turn causes the file system to withdraw. The
problem is that the rename code tries to walk the rgrp list with
gfs2_blk2rgrpd() before the rgrp list is guaranteed to have been
read in from disk. The patch makes the rename function hold the rindex
glock (as the gfs2_unlink code does today), which reads in the rgrp
list if need be. There were a total of three places in the rename
code that improperly referenced the rgrp list without the rindex
glock, and this patch fixes all three.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
diff --git a/fs/gfs2/ops_inode.c b/fs/gfs2/ops_inode.c
index fba0017..0534510 100644
--- a/fs/gfs2/ops_inode.c
+++ b/fs/gfs2/ops_inode.c
@@ -736,7 +736,7 @@ static int gfs2_rename(struct inode *odir, struct dentry *odentry,
struct gfs2_inode *ip = GFS2_I(odentry->d_inode);
struct gfs2_inode *nip = NULL;
struct gfs2_sbd *sdp = GFS2_SB(odir);
- struct gfs2_holder ghs[5], r_gh = { .gh_gl = NULL, };
+ struct gfs2_holder ghs[5], r_gh = { .gh_gl = NULL, }, ri_gh;
struct gfs2_rgrpd *nrgd;
unsigned int num_gh;
int dir_rename = 0;
@@ -750,6 +750,9 @@ static int gfs2_rename(struct inode *odir, struct dentry *odentry,
return 0;
}
+ error = gfs2_rindex_hold(sdp, &ri_gh);
+ if (error)
+ return error;
if (odip != ndip) {
error = gfs2_glock_nq_init(sdp->sd_rename_gl, LM_ST_EXCLUSIVE,
@@ -879,7 +882,7 @@ static int gfs2_rename(struct inode *odir, struct dentry *odentry,
al->al_requested = sdp->sd_max_dirres;
- error = gfs2_inplace_reserve(ndip);
+ error = gfs2_inplace_reserve_ri(ndip);
if (error)
goto out_gunlock_q;
@@ -961,6 +964,7 @@ out_gunlock_r:
if (r_gh.gh_gl)
gfs2_glock_dq_uninit(&r_gh);
out:
+ gfs2_glock_dq_uninit(&ri_gh);
return error;
}
diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c
index f9ddcf4..fb67f59 100644
--- a/fs/gfs2/rgrp.c
+++ b/fs/gfs2/rgrp.c
@@ -1200,7 +1200,8 @@ out:
* Returns: errno
*/
-int gfs2_inplace_reserve_i(struct gfs2_inode *ip, char *file, unsigned int line)
+int gfs2_inplace_reserve_i(struct gfs2_inode *ip, int hold_rindex,
+ char *file, unsigned int line)
{
struct gfs2_sbd *sdp = GFS2_SB(&ip->i_inode);
struct gfs2_alloc *al = ip->i_alloc;
@@ -1211,12 +1212,15 @@ int gfs2_inplace_reserve_i(struct gfs2_inode *ip, char *file, unsigned int line)
return -EINVAL;
try_again:
- /* We need to hold the rindex unless the inode we're using is
- the rindex itself, in which case it's already held. */
- if (ip != GFS2_I(sdp->sd_rindex))
- error = gfs2_rindex_hold(sdp, &al->al_ri_gh);
- else if (!sdp->sd_rgrps) /* We may not have the rindex read in, so: */
- error = gfs2_ri_update_special(ip);
+ if (hold_rindex) {
+ /* We need to hold the rindex unless the inode we're using is
+ the rindex itself, in which case it's already held. */
+ if (ip != GFS2_I(sdp->sd_rindex))
+ error = gfs2_rindex_hold(sdp, &al->al_ri_gh);
+ else if (!sdp->sd_rgrps) /* We may not have the rindex read
+ in, so: */
+ error = gfs2_ri_update_special(ip);
+ }
if (error)
return error;
@@ -1227,7 +1231,7 @@ try_again:
try to free it, and try the allocation again. */
error = get_local_rgrp(ip, &unlinked, &last_unlinked);
if (error) {
- if (ip != GFS2_I(sdp->sd_rindex))
+ if (hold_rindex && ip != GFS2_I(sdp->sd_rindex))
gfs2_glock_dq_uninit(&al->al_ri_gh);
if (error != -EAGAIN)
return error;
@@ -1269,7 +1273,7 @@ void gfs2_inplace_release(struct gfs2_inode *ip)
al->al_rgd = NULL;
if (al->al_rgd_gh.gh_gl)
gfs2_glock_dq_uninit(&al->al_rgd_gh);
- if (ip != GFS2_I(sdp->sd_rindex))
+ if (ip != GFS2_I(sdp->sd_rindex) && al->al_ri_gh.gh_gl)
gfs2_glock_dq_uninit(&al->al_ri_gh);
}
diff --git a/fs/gfs2/rgrp.h b/fs/gfs2/rgrp.h
index f07119d..0e35c04 100644
--- a/fs/gfs2/rgrp.h
+++ b/fs/gfs2/rgrp.h
@@ -39,10 +39,12 @@ static inline void gfs2_alloc_put(struct gfs2_inode *ip)
ip->i_alloc = NULL;
}
-extern int gfs2_inplace_reserve_i(struct gfs2_inode *ip, char *file,
- unsigned int line);
+extern int gfs2_inplace_reserve_i(struct gfs2_inode *ip, int hold_rindex,
+ char *file, unsigned int line);
#define gfs2_inplace_reserve(ip) \
-gfs2_inplace_reserve_i((ip), __FILE__, __LINE__)
+ gfs2_inplace_reserve_i((ip), 1, __FILE__, __LINE__)
+#define gfs2_inplace_reserve_ri(ip) \
+ gfs2_inplace_reserve_i((ip), 0, __FILE__, __LINE__)
extern void gfs2_inplace_release(struct gfs2_inode *ip);
--
1.7.1.1
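The shape of the rgrp.h change is a pair of wrapper macros around one worker, differing only in a flag, plus a release path that drops the lock only if the worker actually took it. A minimal user-space sketch of that pattern follows. All names here (struct holder, rindex_lock, reserve_i, reserve, reserve_ri, release) are hypothetical, and the "lock" is just a pointer that is either set or NULL.

```c
#include <stddef.h>

/* Hypothetical holder: a NULL lock pointer means "nothing held". */
struct holder {
	const void *lock;
};

static const int rindex_lock = 0;	/* dummy object standing in for the rindex glock */

/* Worker: takes the lock itself only when hold_rindex is non-zero. */
static int reserve_i(struct holder *h, int hold_rindex,
		     const char *file, unsigned int line)
{
	(void)file;	/* kept for diagnostics, unused in this sketch */
	(void)line;
	h->lock = hold_rindex ? &rindex_lock : NULL;
	return 0;
}

/* Default wrapper: the worker acquires the lock. */
#define reserve(h) \
	reserve_i((h), 1, __FILE__, __LINE__)
/* Variant for callers that already hold the lock themselves. */
#define reserve_ri(h) \
	reserve_i((h), 0, __FILE__, __LINE__)

/* Returns 1 if a lock was dropped, 0 if there was nothing to drop. */
static int release(struct holder *h)
{
	if (h->lock == NULL)
		return 0;
	h->lock = NULL;
	return 1;
}
```

The check in release() corresponds to the added al->al_ri_gh.gh_gl test in gfs2_inplace_release(): a caller that used the "_ri" variant must not have a lock it never took released on its behalf.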
* [Cluster-devel] [PATCH 21/22] GFS2: Fix type mapping for demote_rq interface
2010-10-18 14:15 [Cluster-devel] GFS2: Pre-pull patch posting Steven Whitehouse
` (19 preceding siblings ...)
2010-10-18 14:15 ` [Cluster-devel] [PATCH 20/22] GFS2 fatal: filesystem consistency error on rename Steven Whitehouse
@ 2010-10-18 14:15 ` Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 22/22] GFS2: fixed typo Steven Whitehouse
21 siblings, 0 replies; 38+ messages in thread
From: Steven Whitehouse @ 2010-10-18 14:15 UTC (permalink / raw)
To: cluster-devel.redhat.com
The glock operations are normally selected by the type of the glock.
The one exception is the transaction glock, which shares
LM_TYPE_NONDISK with the other non-disk glocks, so we need to check
for that directly.
Reported-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c
index 621d80e..0d149dc 100644
--- a/fs/gfs2/glops.c
+++ b/fs/gfs2/glops.c
@@ -452,7 +452,6 @@ const struct gfs2_glock_operations *gfs2_glops_list[] = {
[LM_TYPE_META] = &gfs2_meta_glops,
[LM_TYPE_INODE] = &gfs2_inode_glops,
[LM_TYPE_RGRP] = &gfs2_rgrp_glops,
- [LM_TYPE_NONDISK] = &gfs2_trans_glops,
[LM_TYPE_IOPEN] = &gfs2_iopen_glops,
[LM_TYPE_FLOCK] = &gfs2_flock_glops,
[LM_TYPE_NONDISK] = &gfs2_nondisk_glops,
diff --git a/fs/gfs2/sys.c b/fs/gfs2/sys.c
index 64082a5..748ccb5 100644
--- a/fs/gfs2/sys.c
+++ b/fs/gfs2/sys.c
@@ -230,7 +230,10 @@ static ssize_t demote_rq_store(struct gfs2_sbd *sdp, const char *buf, size_t len
if (gltype > LM_TYPE_JOURNAL)
return -EINVAL;
- glops = gfs2_glops_list[gltype];
+ if (gltype == LM_TYPE_NONDISK && glnum == GFS2_TRANS_LOCK)
+ glops = &gfs2_trans_glops;
+ else
+ glops = gfs2_glops_list[gltype];
if (glops == NULL)
return -EINVAL;
if (!test_and_set_bit(SDF_DEMOTE, &sdp->sd_flags))
--
1.7.1.1
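For background, when a C array initializer names the same index twice, the later entry overrides the earlier one, so the removed [LM_TYPE_NONDISK] = &gfs2_trans_glops line could never be returned by a table lookup. The sketch below shows the fixed arrangement in user space: one table entry per type plus an explicit special case. The enum, the string "operations" and TRANS_LOCK_NUMBER are hypothetical placeholders, not the kernel's definitions.

```c
#include <stddef.h>

/* Hypothetical lock types; the real LM_TYPE_* list is longer. */
enum lock_type { TYPE_META, TYPE_NONDISK, TYPE_COUNT };

/* Strings stand in for the gfs2_glock_operations structures. */
static const char meta_ops[] = "meta";
static const char nondisk_ops[] = "nondisk";
static const char trans_ops[] = "trans";

/* Assumed lock number of the transaction lock, for illustration only. */
#define TRANS_LOCK_NUMBER 1UL

/* One entry per type: no index is initialised twice. */
static const char *const ops_table[TYPE_COUNT] = {
	[TYPE_META]    = meta_ops,
	[TYPE_NONDISK] = nondisk_ops,
};

/* The transaction lock is picked out by number, everything else by type. */
static const char *lookup_ops(enum lock_type type, unsigned long number)
{
	if (type == TYPE_NONDISK && number == TRANS_LOCK_NUMBER)
		return trans_ops;
	return ops_table[type];
}
```

Here lookup_ops(TYPE_NONDISK, TRANS_LOCK_NUMBER) yields the transaction operations, and any other TYPE_NONDISK number falls through to the table.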
* [Cluster-devel] [PATCH 22/22] GFS2: fixed typo
2010-10-18 14:15 [Cluster-devel] GFS2: Pre-pull patch posting Steven Whitehouse
` (20 preceding siblings ...)
2010-10-18 14:15 ` [Cluster-devel] [PATCH 21/22] GFS2: Fix type mapping for demote_rq interface Steven Whitehouse
@ 2010-10-18 14:15 ` Steven Whitehouse
21 siblings, 0 replies; 38+ messages in thread
From: Steven Whitehouse @ 2010-10-18 14:15 UTC (permalink / raw)
To: cluster-devel.redhat.com
From: Andrea Gelmini <andrea.gelmini@gelma.net>
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
diff --git a/fs/gfs2/glock.h b/fs/gfs2/glock.h
index 2bda191..db1c26d 100644
--- a/fs/gfs2/glock.h
+++ b/fs/gfs2/glock.h
@@ -215,7 +215,7 @@ void gfs2_glock_dq_uninit_m(unsigned int num_gh, struct gfs2_holder *ghs);
void gfs2_print_dbg(struct seq_file *seq, const char *fmt, ...);
/**
- * gfs2_glock_nq_init - intialize a holder and enqueue it on a glock
+ * gfs2_glock_nq_init - initialize a holder and enqueue it on a glock
* @gl: the glock
* @state: the state we're requesting
* @flags: the modifier flags
--
1.7.1.1
* [Cluster-devel] GFS2: Pre-pull patch posting
@ 2013-01-03 11:50 Steven Whitehouse
0 siblings, 0 replies; 38+ messages in thread
From: Steven Whitehouse @ 2013-01-03 11:50 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hi,
Here are four small bug fixes for GFS2. There is no common theme here,
just a few items that were fixed recently. The first fixes
lock name generation when the glock number is 0. The second fixes a
race when allocating reservation structures, and the final two fix a
performance issue by making small changes in the allocation code.
Steve.
* [Cluster-devel] GFS2: Pre-pull patch posting
@ 2013-04-05 9:57 Steven Whitehouse
0 siblings, 0 replies; 38+ messages in thread
From: Steven Whitehouse @ 2013-04-05 9:57 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hi,
Here are a few GFS2 fixes which are pending. There are two patches
which fix up a couple of minor issues in the DLM interface code,
a fix for a missing error path in gfs2_rs_alloc(), two patches which
fix problems during "withdraw" and a fix for discards/FITRIM when
using devices with a 4k sector size.
Steve.
* [Cluster-devel] GFS2: Pre-pull patch posting
@ 2015-02-10 10:36 Steven Whitehouse
0 siblings, 0 replies; 38+ messages in thread
From: Steven Whitehouse @ 2015-02-10 10:36 UTC (permalink / raw)
To: cluster-devel.redhat.com
This time we have mostly clean ups. There is a bug fix for a NULL dereference
relating to ACLs, and another which improves (but does not fix entirely) an
allocation fall-back code path. The other three patches are small clean ups.
Steve.
end of thread, other threads:[~2015-02-10 10:36 UTC | newest]
Thread overview: 38+ messages
2010-10-18 14:15 [Cluster-devel] GFS2: Pre-pull patch posting Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 01/22] GFS2: New truncate sequence Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 02/22] GFS2: Remove i_disksize Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 03/22] GFS2: No longer experimental Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 04/22] GFS2: Add a bug trap in allocation code Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 05/22] GFS2: fallocate support Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 06/22] GFS2: Fix whitespace in previous patch Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 07/22] GFS2: Don't enforce min hold time when two demotes occur in rapid succession Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 08/22] GFS2: Update handling of DLM return codes to match reality Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 09/22] GFS2: Use new workqueue scheme Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 10/22] GFS2: Make . and .. qstrs constant Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 11/22] GFS2: Remove ignore_local_fs mount argument Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 12/22] GFS2: Remove localcaching mount option Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 13/22] GFS2: Remove upgrade " Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 14/22] GFS2: Fix journal check for spectator mounts Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 15/22] GFS2: reserve more blocks for transactions Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 16/22] GFS2: Fix compiler warning from previous patch Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 17/22] GFS2: Fix spectator umount issue Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 18/22] GFS2: Add "norecovery" mount option as a synonym for "spectator" Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 19/22] GFS2: Improve journal allocation via sysfs Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 20/22] GFS2 fatal: filesystem consistency error on rename Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 21/22] GFS2: Fix type mapping for demote_rq interface Steven Whitehouse
2010-10-18 14:15 ` [Cluster-devel] [PATCH 22/22] GFS2: fixed typo Steven Whitehouse
-- strict thread matches above, loose matches on Subject: below --
2015-02-10 10:36 [Cluster-devel] GFS2: Pre-pull patch posting Steven Whitehouse
2013-04-05 9:57 Steven Whitehouse
2013-01-03 11:50 Steven Whitehouse
2010-08-02 9:27 Steven Whitehouse
2010-05-17 12:40 Steven Whitehouse
2010-03-11 17:21 Steven Whitehouse
2010-03-01 15:08 Steven Whitehouse
2009-09-10 11:27 Steven Whitehouse
2009-06-10 8:30 Steven Whitehouse
2009-03-18 12:23 [Cluster-devel] [GFS2] " swhiteho
2008-12-17 11:29 [Cluster-devel] GFS2: " swhiteho
2008-09-26 12:00 Steven Whitehouse
2008-07-11 10:11 [Cluster-devel] [GFS2] " swhiteho
2008-04-17 8:37 swhiteho
2008-01-21 9:21 swhiteho