From mboxrd@z Thu Jan 1 00:00:00 1970 From: Robert Peterson Date: Tue, 08 May 2007 09:56:05 -0500 Subject: [Cluster-devel] [PATCH] GFS2: kernel changes to support new gfs2_grow command (Try 3) In-Reply-To: <20070504193714.GB13775@redhat.com> References: <46394174.2030709@redhat.com> <20070504193714.GB13775@redhat.com> Message-ID: <46408F85.4000904@redhat.com> List-Id: To: cluster-devel.redhat.com MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit David Teigland wrote: > On Wed, May 02, 2007 at 08:57:08PM -0500, Robert Peterson wrote: >> @@ -447,7 +479,12 @@ static int gfs2_ri_update(struct gfs2_inode *ip) >> u64 junk = ip->i_di.di_size; >> int error; >> >> - if (do_div(junk, sizeof(struct gfs2_rindex))) { >> + /* If someone is holding the rindex file with a glock, they must >> + be updating it, in which case we may have partial entries. >> + In this case, we ignore the partials. */ >> + if (!gfs2_glock_is_held_excl(ip->i_gl) && >> + !gfs2_glock_is_held_shrd(ip->i_gl) && >> + do_div(junk, sizeof(struct gfs2_rindex))) { >> gfs2_consist_inode(ip); >> return -EIO; >> } > > So the use of glock_is_held _is_ part of an assertion, not part of an > algorithm which I was worried about before. We should only ever get to > this spot with a shared glock, right? (rindex_hold takes it). So a plain > old assertion that the glock is shared at the beginning would be ok, but > this particular check doesn't make sense to me. For the sake of completeness, I retested without this change to make sure it was also still necessary. It was. The problem is that we can now call gfs2_ri_update while there are still partial rindex entries. This can happen when we need to allocate a new page to the rindex, which calls gfs2_inplace_reserve_i, which eventually gets to gfs2_check_rindex_version. Without the change, you get: GFS2: fsid=bob_cluster2:test_gfs.0: fatal: filesystem consistency error GFS2: fsid=bob_cluster2:test_gfs.0: inode = 19 25439 GFS2: fsid=bob_cluster2:test_gfs.0: function = gfs2_ri_update, file = fs/gfs2/rgrp.c, line = 486 GFS2: fsid=bob_cluster2:test_gfs.0: about to withdraw this file system GFS2: fsid=bob_cluster2:test_gfs.0: telling LM to withdraw GFS2: fsid=bob_cluster2:test_gfs.0: withdrawn [] gfs2_lm_withdraw+0x82/0x8d [gfs2] [] gfs2_consist_inode_i+0x6f/0x75 [gfs2] [] gfs2_check_rindex_version+0x11e/0x468 [gfs2] [] __sched_text_start+0x715/0x7c4 [] gfs2_extent_map+0x68/0x9c [gfs2] [] gfs2_inplace_reserve_i+0x9a/0x443 [gfs2] [] gfs2_write_alloc_required+0x192/0x1ce [gfs2] [] gfs2_prepare_write+0x12d/0x237 [gfs2] [] gfs2_prepare_write+0x0/0x237 [gfs2] [] generic_file_buffered_write+0x25b/0x60f [] tcp_v4_do_rcv+0x28/0x307 [] __mark_inode_dirty+0xdd/0x15c [] current_fs_time+0x41/0x46 [] __generic_file_aio_write_nolock+0x4e7/0x560 [] get_page_from_freelist+0x24f/0x2cf [] generic_file_aio_write+0x55/0xb3 [] do_sync_write+0xc7/0x10a [] gfs2_holder_uninit+0xb/0x1b [gfs2] [] autoremove_wake_function+0x0/0x35 [] gfs2_holder_uninit+0xb/0x1b [gfs2] [] gfs2_llseek+0x76/0x9a [gfs2] [] do_sync_write+0x0/0x10a [] vfs_write+0x8a/0x10c [] sys_write+0x41/0x67 [] sysenter_past_esp+0x5d/0x81 [] svc_defer+0x6b/0x126 ======================= >> @@ -457,6 +494,9 @@ static int gfs2_ri_update(struct gfs2_inode *ip) >> file_ra_state_init(&ra_state, inode->i_mapping); >> for (sdp->sd_rgrps = 0;; sdp->sd_rgrps++) { >> loff_t pos = sdp->sd_rgrps * sizeof(struct gfs2_rindex); >> + >> + if (pos + sizeof(struct gfs2_rindex) >= ip->i_di.di_size) >> + break; > > Why is this needed now if it wasn't before? For the same reason as the above check is necessary. Regards, Bob Peterson Red Hat Cluster Suite