All of lore.kernel.org
 help / color / mirror / Atom feed
* [Cluster-devel] [RFC 0/3] GFS2 rename race
@ 2006-12-12  7:53 Wendy Cheng
  2006-12-13  0:30 ` Russell Cattelan
  0 siblings, 1 reply; 3+ messages in thread
From: Wendy Cheng @ 2006-12-12  7:53 UTC (permalink / raw)
  To: cluster-devel.redhat.com

GFS is reported to have a rename race due to the following calling sequence:

1. vfs do_rename() does path lookup (on directories)
2. vfs do_rename() calls lock_rename() for local locking
3. vfs do_rename() does final file names lookup
4. vfs do_rename() calls GFS's rename function

Since (2) is not cluster-aware, by the time GFS rename is invoked in 
(4), another node could have altered (remove or create) the destination 
file *after* dentry is obtained from (3). The stale dentry will fail the 
final round of inode sanity check. The system call is subsequently 
aborted with ENOENT error. Some applications (mostly mail servers) are 
said to have issues with this - the bug is reported as "error when two 
nodes rename two files to same new name".

One possible (simple) fix for the subject bugzilla is to alter vfs 
lock_rename() to take cluster locks. This requires upstream VFS layer 
change (and it does make the described problem go away). Unfortunately, 
this issue turns out to be much more complicated than the bugzilla has 
reported. It is not so much about rename race but a generic GFS inherent 
design issue. Since GFS never bothers to ask other nodes to refresh 
their dentry upon removing the file (this would create stale dentries 
hanging around in other nodes), each node is responsible to check the 
inode validity for relevant operations (e.g. call gfs_unlink_ok()). This 
implies if we do extra lockings in lock_rename(), it makes no difference 
in gfs_rename() - we still need to check gfs_unlink_ok() and it still 
can return ENOENT error, *if the stale dentries have existed before 
rename system call is invoked*.  So we end up giving up the lock_rename 
fix but go for the following three patches:

Patch 3-1: Yank inode refresh logic out of gfs(x)_get_dentry (currently 
only used by NFS server) so both NFS and rename can share the function.
Patch 3-2:  Fix a possible rename deadlock ordering.
Patch 3-3: The core changes for the described bugzilla.

-- Wendy









^ permalink raw reply	[flat|nested] 3+ messages in thread

* [Cluster-devel] [RFC 0/3] GFS2 rename race
  2006-12-12  7:53 [Cluster-devel] [RFC 0/3] GFS2 rename race Wendy Cheng
@ 2006-12-13  0:30 ` Russell Cattelan
  2006-12-13  6:01   ` Wendy Cheng
  0 siblings, 1 reply; 3+ messages in thread
From: Russell Cattelan @ 2006-12-13  0:30 UTC (permalink / raw)
  To: cluster-devel.redhat.com

I applied the three patches

Putz-Less Yum Nut->quilt applied (fix it fool)?yes
gfs2_refresh_inode.patch
gfs2_rename_lock.patch
gfs2_rename_core.patch

s/built-in.o: In function `gfs2_get_dentry':
/fs/gfs2/ops_export.c:294: undefined reference to `gfs2_refresh_inode'


Did I miss something?

On Tue, 2006-12-12 at 02:53 -0500, Wendy Cheng wrote:
> GFS is reported to have a rename race due to the following calling sequence:
> 
> 1. vfs do_rename() does path lookup (on directories)
> 2. vfs do_rename() calls lock_rename() for local locking
> 3. vfs do_rename() does final file names lookup
> 4. vfs do_rename() calls GFS's rename function
> 
> Since (2) is not cluster-aware, by the time GFS rename is invoked in 
> (4), another node could have altered (remove or create) the destination 
> file *after* dentry is obtained from (3). The stale dentry will fail the 
> final round of inode sanity check. The system call is subsequently 
> aborted with ENOENT error. Some applications (mostly mail servers) are 
> said to have issues with this - the bug is reported as "error when two 
> nodes rename two files to same new name".
> 
> One possible (simple) fix for the subject bugzilla is to alter vfs 
> lock_rename() to take cluster locks. This requires upstream VFS layer 
> change (and it does make the described problem go away). Unfortunately, 
> this issue turns out to be much more complicated than the bugzilla has 
> reported. It is not so much about rename race but a generic GFS inherent 
> design issue. Since GFS never bothers to ask other nodes to refresh 
> their dentry upon removing the file (this would create stale dentries 
> hanging around in other nodes), each node is responsible to check the 
> inode validity for relevant operations (e.g. call gfs_unlink_ok()). This 
> implies if we do extra lockings in lock_rename(), it makes no difference 
> in gfs_rename() - we still need to check gfs_unlink_ok() and it still 
> can return ENOENT error, *if the stale dentries have existed before 
> rename system call is invoked*.  So we end up giving up the lock_rename 
> fix but go for the following three patches:
> 
> Patch 3-1: Yank inode refresh logic out of gfs(x)_get_dentry (currently 
> only used by NFS server) so both NFS and rename can share the function.
> Patch 3-2:  Fix a possible rename deadlock ordering.
> Patch 3-3: The core changes for the described bugzilla.
> 
> -- Wendy
> 
> 
> 
> 
> 
> 
-- 
Russell Cattelan <cattelan@redhat.com>



^ permalink raw reply	[flat|nested] 3+ messages in thread

* [Cluster-devel] [RFC 0/3] GFS2 rename race
  2006-12-13  0:30 ` Russell Cattelan
@ 2006-12-13  6:01   ` Wendy Cheng
  0 siblings, 0 replies; 3+ messages in thread
From: Wendy Cheng @ 2006-12-13  6:01 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Russell Cattelan wrote:

>I applied the three patches
>
>Putz-Less Yum Nut->quilt applied (fix it fool)?yes
>gfs2_refresh_inode.patch
>gfs2_rename_lock.patch
>gfs2_rename_core.patch
>
>s/built-in.o: In function `gfs2_get_dentry':
>/fs/gfs2/ops_export.c:294: undefined reference to `gfs2_refresh_inode'
>
>
>Did I miss something?
>
>  
>
No, you didn't. The patches were ported from GFS1 (which were quite 
messy when compared with GFS2). I manually made GFS2 patches one by one 
(for  review purpose). Then I made some function name changes in the 
overall kernel source while doing the final round of testings. 
Apparently I forgot to go back to fix the first patch. The 
gfs2_refresh_inode should have been "gfs2_refresh_iobj". Will submit the 
fixed patch set after I've done with Steve's suggestion (on patch #3). 
Thanks for catching this.

-- Wendy




^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2006-12-13  6:01 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-12-12  7:53 [Cluster-devel] [RFC 0/3] GFS2 rename race Wendy Cheng
2006-12-13  0:30 ` Russell Cattelan
2006-12-13  6:01   ` Wendy Cheng

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.