* [Cluster-devel] GFS2: Pre-pull patch posting (fixes)
@ 2011-04-19 8:40 Steven Whitehouse
2011-04-19 8:40 ` [Cluster-devel] [PATCH 1/4] GFS2: write_end error path fails to unlock transaction lock Steven Whitehouse
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: Steven Whitehouse @ 2011-04-19 8:40 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hi,
Here are four fixes for GFS2. See the individual patches for the detailed
descriptions,
Steve.
^ permalink raw reply [flat|nested] 5+ messages in thread
* [Cluster-devel] [PATCH 1/4] GFS2: write_end error path fails to unlock transaction lock
2011-04-19 8:40 [Cluster-devel] GFS2: Pre-pull patch posting (fixes) Steven Whitehouse
@ 2011-04-19 8:40 ` Steven Whitehouse
2011-04-19 8:40 ` [Cluster-devel] [PATCH 2/4] GFS2: directly write blocks past i_size Steven Whitehouse
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: Steven Whitehouse @ 2011-04-19 8:40 UTC (permalink / raw)
To: cluster-devel.redhat.com
From: Bob Peterson <rpeterso@redhat.com>
I did an audit of gfs2's transaction glock for bugzilla bug
658619 and ran across this:
In function gfs2_write_end, in the unlikely event that
gfs2_meta_inode_buffer returns an error, the code may forget
to unlock the transaction lock because the "failed" label
appears after the call to function gfs2_trans_end.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index c71995b..0f5c4f9 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -884,8 +884,8 @@ static int gfs2_write_end(struct file *file, struct address_space *mapping,
}
brelse(dibh);
- gfs2_trans_end(sdp);
failed:
+ gfs2_trans_end(sdp);
if (al) {
gfs2_inplace_release(ip);
gfs2_quota_unlock(ip);
--
1.7.4
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [Cluster-devel] [PATCH 2/4] GFS2: directly write blocks past i_size
2011-04-19 8:40 [Cluster-devel] GFS2: Pre-pull patch posting (fixes) Steven Whitehouse
2011-04-19 8:40 ` [Cluster-devel] [PATCH 1/4] GFS2: write_end error path fails to unlock transaction lock Steven Whitehouse
@ 2011-04-19 8:40 ` Steven Whitehouse
2011-04-19 8:40 ` [Cluster-devel] [PATCH 3/4] GFS2: Don't try to deallocate unlinked inodes when mounted ro Steven Whitehouse
2011-04-19 8:40 ` [Cluster-devel] [PATCH 4/4] GFS2: filesystem hang caused by incorrect lock order Steven Whitehouse
3 siblings, 0 replies; 5+ messages in thread
From: Steven Whitehouse @ 2011-04-19 8:40 UTC (permalink / raw)
To: cluster-devel.redhat.com
From: Benjamin Marzinski <bmarzins@redhat.com>
GFS2 was relying on the writepage code to write out the zeroed data for
fallocate. However, with FALLOC_FL_KEEP_SIZE set, this may be past i_size.
If it is, it will be ignored. To work around this, gfs2 now calls
write_dirty_buffer directly on the buffer_heads when FALLOC_FL_KEEP_SIZE
is set, and it's writing past i_size.
This version is just a cleanup of my last version
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c
index b2682e0..e483108 100644
--- a/fs/gfs2/file.c
+++ b/fs/gfs2/file.c
@@ -617,18 +617,51 @@ static ssize_t gfs2_file_aio_write(struct kiocb *iocb, const struct iovec *iov,
return generic_file_aio_write(iocb, iov, nr_segs, pos);
}
-static void empty_write_end(struct page *page, unsigned from,
- unsigned to)
+static int empty_write_end(struct page *page, unsigned from,
+ unsigned to, int mode)
{
- struct gfs2_inode *ip = GFS2_I(page->mapping->host);
+ struct inode *inode = page->mapping->host;
+ struct gfs2_inode *ip = GFS2_I(inode);
+ struct buffer_head *bh;
+ unsigned offset, blksize = 1 << inode->i_blkbits;
+ pgoff_t end_index = i_size_read(inode) >> PAGE_CACHE_SHIFT;
zero_user(page, from, to-from);
mark_page_accessed(page);
- if (!gfs2_is_writeback(ip))
- gfs2_page_add_databufs(ip, page, from, to);
+ if (page->index < end_index || !(mode & FALLOC_FL_KEEP_SIZE)) {
+ if (!gfs2_is_writeback(ip))
+ gfs2_page_add_databufs(ip, page, from, to);
+
+ block_commit_write(page, from, to);
+ return 0;
+ }
+
+ offset = 0;
+ bh = page_buffers(page);
+ while (offset < to) {
+ if (offset >= from) {
+ set_buffer_uptodate(bh);
+ mark_buffer_dirty(bh);
+ clear_buffer_new(bh);
+ write_dirty_buffer(bh, WRITE);
+ }
+ offset += blksize;
+ bh = bh->b_this_page;
+ }
- block_commit_write(page, from, to);
+ offset = 0;
+ bh = page_buffers(page);
+ while (offset < to) {
+ if (offset >= from) {
+ wait_on_buffer(bh);
+ if (!buffer_uptodate(bh))
+ return -EIO;
+ }
+ offset += blksize;
+ bh = bh->b_this_page;
+ }
+ return 0;
}
static int needs_empty_write(sector_t block, struct inode *inode)
@@ -643,7 +676,8 @@ static int needs_empty_write(sector_t block, struct inode *inode)
return !buffer_mapped(&bh_map);
}
-static int write_empty_blocks(struct page *page, unsigned from, unsigned to)
+static int write_empty_blocks(struct page *page, unsigned from, unsigned to,
+ int mode)
{
struct inode *inode = page->mapping->host;
unsigned start, end, next, blksize;
@@ -668,7 +702,9 @@ static int write_empty_blocks(struct page *page, unsigned from, unsigned to)
gfs2_block_map);
if (unlikely(ret))
return ret;
- empty_write_end(page, start, end);
+ ret = empty_write_end(page, start, end, mode);
+ if (unlikely(ret))
+ return ret;
end = 0;
}
start = next;
@@ -682,7 +718,9 @@ static int write_empty_blocks(struct page *page, unsigned from, unsigned to)
ret = __block_write_begin(page, start, end - start, gfs2_block_map);
if (unlikely(ret))
return ret;
- empty_write_end(page, start, end);
+ ret = empty_write_end(page, start, end, mode);
+ if (unlikely(ret))
+ return ret;
}
return 0;
@@ -731,7 +769,7 @@ static int fallocate_chunk(struct inode *inode, loff_t offset, loff_t len,
if (curr == end)
to = end_offset;
- error = write_empty_blocks(page, from, to);
+ error = write_empty_blocks(page, from, to, mode);
if (!error && offset + to > inode->i_size &&
!(mode & FALLOC_FL_KEEP_SIZE)) {
i_size_write(inode, offset + to);
--
1.7.4
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [Cluster-devel] [PATCH 3/4] GFS2: Don't try to deallocate unlinked inodes when mounted ro
2011-04-19 8:40 [Cluster-devel] GFS2: Pre-pull patch posting (fixes) Steven Whitehouse
2011-04-19 8:40 ` [Cluster-devel] [PATCH 1/4] GFS2: write_end error path fails to unlock transaction lock Steven Whitehouse
2011-04-19 8:40 ` [Cluster-devel] [PATCH 2/4] GFS2: directly write blocks past i_size Steven Whitehouse
@ 2011-04-19 8:40 ` Steven Whitehouse
2011-04-19 8:40 ` [Cluster-devel] [PATCH 4/4] GFS2: filesystem hang caused by incorrect lock order Steven Whitehouse
3 siblings, 0 replies; 5+ messages in thread
From: Steven Whitehouse @ 2011-04-19 8:40 UTC (permalink / raw)
To: cluster-devel.redhat.com
This adds a couple of missing tests to avoid read-only nodes
from attempting to deallocate unlinked inodes.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Reported-by: Michel Andre de la Porte <madelaporte@ubi.com>
diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c
index 3754e3c..25eeb2b 100644
--- a/fs/gfs2/glops.c
+++ b/fs/gfs2/glops.c
@@ -385,6 +385,10 @@ static int trans_go_demote_ok(const struct gfs2_glock *gl)
static void iopen_go_callback(struct gfs2_glock *gl)
{
struct gfs2_inode *ip = (struct gfs2_inode *)gl->gl_object;
+ struct gfs2_sbd *sdp = gl->gl_sbd;
+
+ if (sdp->sd_vfs->s_flags & MS_RDONLY)
+ return;
if (gl->gl_demote_state == LM_ST_UNLOCKED &&
gl->gl_state == LM_ST_SHARED && ip) {
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index a4e23d6..5b2cb81 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -1318,12 +1318,13 @@ static int gfs2_show_options(struct seq_file *s, struct vfsmount *mnt)
static void gfs2_evict_inode(struct inode *inode)
{
- struct gfs2_sbd *sdp = inode->i_sb->s_fs_info;
+ struct super_block *sb = inode->i_sb;
+ struct gfs2_sbd *sdp = sb->s_fs_info;
struct gfs2_inode *ip = GFS2_I(inode);
struct gfs2_holder gh;
int error;
- if (inode->i_nlink)
+ if (inode->i_nlink || (sb->s_flags & MS_RDONLY))
goto out;
error = gfs2_glock_nq_init(ip->i_gl, LM_ST_EXCLUSIVE, 0, &gh);
--
1.7.4
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [Cluster-devel] [PATCH 4/4] GFS2: filesystem hang caused by incorrect lock order
2011-04-19 8:40 [Cluster-devel] GFS2: Pre-pull patch posting (fixes) Steven Whitehouse
` (2 preceding siblings ...)
2011-04-19 8:40 ` [Cluster-devel] [PATCH 3/4] GFS2: Don't try to deallocate unlinked inodes when mounted ro Steven Whitehouse
@ 2011-04-19 8:40 ` Steven Whitehouse
3 siblings, 0 replies; 5+ messages in thread
From: Steven Whitehouse @ 2011-04-19 8:40 UTC (permalink / raw)
To: cluster-devel.redhat.com
From: Bob Peterson <rpeterso@redhat.com>
This patch fixes a deadlock in GFS2 where two processes are trying
to reclaim an unlinked dinode:
One holds the inode glock and calls gfs2_lookup_by_inum trying to look
up the inode, which it can't, due to I_FREEING. The other has set
I_FREEING from vfs and is at the beginning of gfs2_delete_inode
waiting for the glock, which is held by the first. The solution is to
add a new non_block parameter to the gfs2_iget function that causes it
to return -ENOENT if the inode is being freed.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
diff --git a/fs/gfs2/dir.c b/fs/gfs2/dir.c
index 5c356d0..f789c57 100644
--- a/fs/gfs2/dir.c
+++ b/fs/gfs2/dir.c
@@ -1506,7 +1506,7 @@ struct inode *gfs2_dir_search(struct inode *dir, const struct qstr *name)
inode = gfs2_inode_lookup(dir->i_sb,
be16_to_cpu(dent->de_type),
be64_to_cpu(dent->de_inum.no_addr),
- be64_to_cpu(dent->de_inum.no_formal_ino));
+ be64_to_cpu(dent->de_inum.no_formal_ino), 0);
brelse(bh);
return inode;
}
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 97d54a2..9134dcb 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -40,37 +40,61 @@ struct gfs2_inum_range_host {
u64 ir_length;
};
+struct gfs2_skip_data {
+ u64 no_addr;
+ int skipped;
+ int non_block;
+};
+
static int iget_test(struct inode *inode, void *opaque)
{
struct gfs2_inode *ip = GFS2_I(inode);
- u64 *no_addr = opaque;
+ struct gfs2_skip_data *data = opaque;
- if (ip->i_no_addr == *no_addr)
+ if (ip->i_no_addr == data->no_addr) {
+ if (data->non_block &&
+ inode->i_state & (I_FREEING|I_CLEAR|I_WILL_FREE)) {
+ data->skipped = 1;
+ return 0;
+ }
return 1;
-
+ }
return 0;
}
static int iget_set(struct inode *inode, void *opaque)
{
struct gfs2_inode *ip = GFS2_I(inode);
- u64 *no_addr = opaque;
+ struct gfs2_skip_data *data = opaque;
- inode->i_ino = (unsigned long)*no_addr;
- ip->i_no_addr = *no_addr;
+ if (data->skipped)
+ return -ENOENT;
+ inode->i_ino = (unsigned long)(data->no_addr);
+ ip->i_no_addr = data->no_addr;
return 0;
}
struct inode *gfs2_ilookup(struct super_block *sb, u64 no_addr)
{
unsigned long hash = (unsigned long)no_addr;
- return ilookup5(sb, hash, iget_test, &no_addr);
+ struct gfs2_skip_data data;
+
+ data.no_addr = no_addr;
+ data.skipped = 0;
+ data.non_block = 0;
+ return ilookup5(sb, hash, iget_test, &data);
}
-static struct inode *gfs2_iget(struct super_block *sb, u64 no_addr)
+static struct inode *gfs2_iget(struct super_block *sb, u64 no_addr,
+ int non_block)
{
+ struct gfs2_skip_data data;
unsigned long hash = (unsigned long)no_addr;
- return iget5_locked(sb, hash, iget_test, iget_set, &no_addr);
+
+ data.no_addr = no_addr;
+ data.skipped = 0;
+ data.non_block = non_block;
+ return iget5_locked(sb, hash, iget_test, iget_set, &data);
}
/**
@@ -111,19 +135,20 @@ static void gfs2_set_iop(struct inode *inode)
* @sb: The super block
* @no_addr: The inode number
* @type: The type of the inode
+ * non_block: Can we block on inodes that are being freed?
*
* Returns: A VFS inode, or an error
*/
struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned int type,
- u64 no_addr, u64 no_formal_ino)
+ u64 no_addr, u64 no_formal_ino, int non_block)
{
struct inode *inode;
struct gfs2_inode *ip;
struct gfs2_glock *io_gl = NULL;
int error;
- inode = gfs2_iget(sb, no_addr);
+ inode = gfs2_iget(sb, no_addr, non_block);
ip = GFS2_I(inode);
if (!inode)
@@ -185,11 +210,12 @@ struct inode *gfs2_lookup_by_inum(struct gfs2_sbd *sdp, u64 no_addr,
{
struct super_block *sb = sdp->sd_vfs;
struct gfs2_holder i_gh;
- struct inode *inode;
+ struct inode *inode = NULL;
int error;
+ /* Must not read in block until block type is verified */
error = gfs2_glock_nq_num(sdp, no_addr, &gfs2_inode_glops,
- LM_ST_SHARED, LM_FLAG_ANY, &i_gh);
+ LM_ST_EXCLUSIVE, GL_SKIP, &i_gh);
if (error)
return ERR_PTR(error);
@@ -197,7 +223,7 @@ struct inode *gfs2_lookup_by_inum(struct gfs2_sbd *sdp, u64 no_addr,
if (error)
goto fail;
- inode = gfs2_inode_lookup(sb, DT_UNKNOWN, no_addr, 0);
+ inode = gfs2_inode_lookup(sb, DT_UNKNOWN, no_addr, 0, 1);
if (IS_ERR(inode))
goto fail;
@@ -843,7 +869,7 @@ struct inode *gfs2_createi(struct gfs2_holder *ghs, const struct qstr *name,
goto fail_gunlock2;
inode = gfs2_inode_lookup(dir->i_sb, IF2DT(mode), inum.no_addr,
- inum.no_formal_ino);
+ inum.no_formal_ino, 0);
if (IS_ERR(inode))
goto fail_gunlock2;
diff --git a/fs/gfs2/inode.h b/fs/gfs2/inode.h
index 3e00a66..099ca30 100644
--- a/fs/gfs2/inode.h
+++ b/fs/gfs2/inode.h
@@ -97,7 +97,8 @@ err:
}
extern struct inode *gfs2_inode_lookup(struct super_block *sb, unsigned type,
- u64 no_addr, u64 no_formal_ino);
+ u64 no_addr, u64 no_formal_ino,
+ int non_block);
extern struct inode *gfs2_lookup_by_inum(struct gfs2_sbd *sdp, u64 no_addr,
u64 *no_formal_ino,
unsigned int blktype);
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index 42ef243..d3c69eb 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -430,7 +430,7 @@ static int gfs2_lookup_root(struct super_block *sb, struct dentry **dptr,
struct dentry *dentry;
struct inode *inode;
- inode = gfs2_inode_lookup(sb, DT_DIR, no_addr, 0);
+ inode = gfs2_inode_lookup(sb, DT_DIR, no_addr, 0, 0);
if (IS_ERR(inode)) {
fs_err(sdp, "can't read in %s inode: %ld\n", name, PTR_ERR(inode));
return PTR_ERR(inode);
diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c
index cf930cd..6fcae84 100644
--- a/fs/gfs2/rgrp.c
+++ b/fs/gfs2/rgrp.c
@@ -945,7 +945,7 @@ static void try_rgrp_unlink(struct gfs2_rgrpd *rgd, u64 *last_unlinked, u64 skip
/* rgblk_search can return a block < goal, so we need to
keep it marching forward. */
no_addr = block + rgd->rd_data0;
- goal++;
+ goal = max(block + 1, goal + 1);
if (*last_unlinked != NO_BLOCK && no_addr <= *last_unlinked)
continue;
if (no_addr == skip)
@@ -971,7 +971,7 @@ static void try_rgrp_unlink(struct gfs2_rgrpd *rgd, u64 *last_unlinked, u64 skip
found++;
/* Limit reclaim to sensible number of tasks */
- if (found > 2*NR_CPUS)
+ if (found > NR_CPUS)
return;
}
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index 5b2cb81..b9f28e6 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -1327,7 +1327,8 @@ static void gfs2_evict_inode(struct inode *inode)
if (inode->i_nlink || (sb->s_flags & MS_RDONLY))
goto out;
- error = gfs2_glock_nq_init(ip->i_gl, LM_ST_EXCLUSIVE, 0, &gh);
+ /* Must not read inode block until block type has been verified */
+ error = gfs2_glock_nq_init(ip->i_gl, LM_ST_EXCLUSIVE, GL_SKIP, &gh);
if (unlikely(error)) {
gfs2_glock_dq_uninit(&ip->i_iopen_gh);
goto out;
@@ -1337,6 +1338,12 @@ static void gfs2_evict_inode(struct inode *inode)
if (error)
goto out_truncate;
+ if (test_bit(GIF_INVALID, &ip->i_flags)) {
+ error = gfs2_inode_refresh(ip);
+ if (error)
+ goto out_truncate;
+ }
+
ip->i_iopen_gh.gh_flags |= GL_NOCACHE;
gfs2_glock_dq_wait(&ip->i_iopen_gh);
gfs2_holder_reinit(LM_ST_EXCLUSIVE, LM_FLAG_TRY_1CB | GL_NOCACHE, &ip->i_iopen_gh);
--
1.7.4
^ permalink raw reply related [flat|nested] 5+ messages in thread
end of thread, other threads:[~2011-04-19 8:40 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-04-19 8:40 [Cluster-devel] GFS2: Pre-pull patch posting (fixes) Steven Whitehouse
2011-04-19 8:40 ` [Cluster-devel] [PATCH 1/4] GFS2: write_end error path fails to unlock transaction lock Steven Whitehouse
2011-04-19 8:40 ` [Cluster-devel] [PATCH 2/4] GFS2: directly write blocks past i_size Steven Whitehouse
2011-04-19 8:40 ` [Cluster-devel] [PATCH 3/4] GFS2: Don't try to deallocate unlinked inodes when mounted ro Steven Whitehouse
2011-04-19 8:40 ` [Cluster-devel] [PATCH 4/4] GFS2: filesystem hang caused by incorrect lock order Steven Whitehouse
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).