* [PATCH] dcache: fix d_splice_alias handling of aliases @ 2014-01-15 15:17 J. Bruce Fields 2014-01-15 17:34 ` Miklos Szeredi ` (2 more replies) 0 siblings, 3 replies; 21+ messages in thread From: J. Bruce Fields @ 2014-01-15 15:17 UTC (permalink / raw) To: Al Viro Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-nfs-u79uwXL29TY76Z2rM5mHXA, Miklos Szeredi From: "J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> d_splice_alias can create duplicate directory aliases (in the !new case), or (in the new case) d_move without holding appropriate locks. d_materialise_unique deals with both of these problems. (The latter seems to be dealt by trylocks (see __d_unalias), which look like they could cause spurious lookup failures--but that's at least better than corrupting the dcache.) Signed-off-by: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> --- fs/dcache.c | 25 +------------------------ 1 file changed, 1 insertion(+), 24 deletions(-) Only lightly tested.... If this is right, then we can also just ditch d_splice_alias completely, and clean up the various d_find_alias's. I think the only reason we have both d_splice_alias and d_materialise_unique is that the former was written for exportable filesystems and the latter for distributed filesystems. But we have at least one exportable filesystem (fuse) using d_materialise_unique. And I doubt d_splice_alias was ever completely correct even for on-disk filesystems. Am I missing some subtlety? --b. diff --git a/fs/dcache.c b/fs/dcache.c index 4bdb300..da82fa9 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -1926,33 +1926,10 @@ EXPORT_SYMBOL(d_obtain_alias); */ struct dentry *d_splice_alias(struct inode *inode, struct dentry *dentry) { - struct dentry *new = NULL; - if (IS_ERR(inode)) return ERR_CAST(inode); - if (inode && S_ISDIR(inode->i_mode)) { - spin_lock(&inode->i_lock); - new = __d_find_alias(inode, 1); - if (new) { - BUG_ON(!(new->d_flags & DCACHE_DISCONNECTED)); - spin_unlock(&inode->i_lock); - security_d_instantiate(new, inode); - d_move(new, dentry); - iput(inode); - } else { - /* already taking inode->i_lock, so d_add() by hand */ - __d_instantiate(dentry, inode); - spin_unlock(&inode->i_lock); - security_d_instantiate(dentry, inode); - d_rehash(dentry); - } - } else { - d_instantiate(dentry, inode); - if (d_unhashed(dentry)) - d_rehash(dentry); - } - return new; + return d_materialise_unique(dentry, inode); } EXPORT_SYMBOL(d_splice_alias); -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH] dcache: fix d_splice_alias handling of aliases 2014-01-15 15:17 [PATCH] dcache: fix d_splice_alias handling of aliases J. Bruce Fields @ 2014-01-15 17:34 ` Miklos Szeredi 2014-01-15 17:57 ` J. Bruce Fields [not found] ` <20140115151749.GF23999-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> 2014-01-17 12:17 ` Christoph Hellwig 2 siblings, 1 reply; 21+ messages in thread From: Miklos Szeredi @ 2014-01-15 17:34 UTC (permalink / raw) To: J. Bruce Fields; +Cc: Al Viro, linux-fsdevel, linux-kernel, linux-nfs, miklos On Wed, 2014-01-15 at 10:17 -0500, J. Bruce Fields wrote: > From: "J. Bruce Fields" <bfields@redhat.com> > > d_splice_alias can create duplicate directory aliases (in the !new > case), or (in the new case) d_move without holding appropriate locks. It can d_move, because the dentry is known to be disconnected, i.e. it doesn't have a parent for which we could obtain the lock. > d_materialise_unique deals with both of these problems. (The latter > seems to be dealt by trylocks (see __d_unalias), which look like they > could cause spurious lookup failures--but that's at least better than > corrupting the dcache.) > > Signed-off-by: J. Bruce Fields <bfields@redhat.com> > --- > fs/dcache.c | 25 +------------------------ > 1 file changed, 1 insertion(+), 24 deletions(-) > > Only lightly tested.... If this is right, then we can also just ditch > d_splice_alias completely, and clean up the various d_find_alias's. > > I think the only reason we have both d_splice_alias and > d_materialise_unique is that the former was written for exportable > filesystems and the latter for distributed filesystems. > > But we have at least one exportable filesystem (fuse) using > d_materialise_unique. And I doubt d_splice_alias was ever completely > correct even for on-disk filesystems. > > Am I missing some subtlety? One subtle difference is that for a non-directory d_splice_alias() will reconnect a DCACHE_DISCONNECTED dentry if one exists, while d_materialise_unique() will not. Does this matter in practice? The small number of extra dentries probably does not matter. Thanks, Miklos > > --b. > > diff --git a/fs/dcache.c b/fs/dcache.c > index 4bdb300..da82fa9 100644 > --- a/fs/dcache.c > +++ b/fs/dcache.c > @@ -1926,33 +1926,10 @@ EXPORT_SYMBOL(d_obtain_alias); > */ > struct dentry *d_splice_alias(struct inode *inode, struct dentry *dentry) > { > - struct dentry *new = NULL; > - > if (IS_ERR(inode)) > return ERR_CAST(inode); > > - if (inode && S_ISDIR(inode->i_mode)) { > - spin_lock(&inode->i_lock); > - new = __d_find_alias(inode, 1); > - if (new) { > - BUG_ON(!(new->d_flags & DCACHE_DISCONNECTED)); > - spin_unlock(&inode->i_lock); > - security_d_instantiate(new, inode); > - d_move(new, dentry); > - iput(inode); > - } else { > - /* already taking inode->i_lock, so d_add() by hand */ > - __d_instantiate(dentry, inode); > - spin_unlock(&inode->i_lock); > - security_d_instantiate(dentry, inode); > - d_rehash(dentry); > - } > - } else { > - d_instantiate(dentry, inode); > - if (d_unhashed(dentry)) > - d_rehash(dentry); > - } > - return new; > + return d_materialise_unique(dentry, inode); > } > EXPORT_SYMBOL(d_splice_alias); > ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] dcache: fix d_splice_alias handling of aliases 2014-01-15 17:34 ` Miklos Szeredi @ 2014-01-15 17:57 ` J. Bruce Fields [not found] ` <20140115175723.GA4596-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> 0 siblings, 1 reply; 21+ messages in thread From: J. Bruce Fields @ 2014-01-15 17:57 UTC (permalink / raw) To: Miklos Szeredi; +Cc: Al Viro, linux-fsdevel, linux-kernel, linux-nfs, miklos On Wed, Jan 15, 2014 at 06:34:56PM +0100, Miklos Szeredi wrote: > On Wed, 2014-01-15 at 10:17 -0500, J. Bruce Fields wrote: > > From: "J. Bruce Fields" <bfields@redhat.com> > > > > d_splice_alias can create duplicate directory aliases (in the !new > > case), or (in the new case) d_move without holding appropriate locks. > > It can d_move, because the dentry is known to be disconnected, i.e. it > doesn't have a parent for which we could obtain the lock. DCACHE_DISCONNECTED doesn't mean that. When you lookup a dentry by filehandle that dentry is initially marked DCACHE_DISCONNECTED. It is cleared only after reconnect_path() has verified that the dentry is reachable all the way from the root. So !DCACHE_DISCONNECTED implies that the dentry is connected all the way up to the root, but the converse is not true. This has been a source of confusion, but it is explained in Documentation/filesystems/nfs/Exporting. Recently I've cleaned up a few odd uses of DCACHE_DISCONNECTED and rewritten reconnect_path(), partly as an attempt to clarify the situation. Let me know if any of that doesn't look right to you.... > > d_materialise_unique deals with both of these problems. (The latter > > seems to be dealt by trylocks (see __d_unalias), which look like they > > could cause spurious lookup failures--but that's at least better than > > corrupting the dcache.) > > > > Signed-off-by: J. Bruce Fields <bfields@redhat.com> > > --- > > fs/dcache.c | 25 +------------------------ > > 1 file changed, 1 insertion(+), 24 deletions(-) > > > > Only lightly tested.... If this is right, then we can also just ditch > > d_splice_alias completely, and clean up the various d_find_alias's. > > > > I think the only reason we have both d_splice_alias and > > d_materialise_unique is that the former was written for exportable > > filesystems and the latter for distributed filesystems. > > > > But we have at least one exportable filesystem (fuse) using > > d_materialise_unique. And I doubt d_splice_alias was ever completely > > correct even for on-disk filesystems. > > > > Am I missing some subtlety? > > One subtle difference is that for a non-directory d_splice_alias() will > reconnect a DCACHE_DISCONNECTED dentry if one exists, while > d_materialise_unique() will not. Actually until f80de2cde10350b8d146e375ff8b634e72e6a827 "dcache: don't clear DCACHE_DISCONNECTED too early", it was the reverse: d_materialise_unique cleared DISCONNECTED and d_splice_alias (correctly) did not. The only place where it should be cleared is reconnect_path(). > Does this matter in practice? The small number of extra dentries > probably does not matter. Directories are assumed to have unique aliases. When they don't, the kernel can deadlock or crash. --b. ^ permalink raw reply [flat|nested] 21+ messages in thread
[parent not found: <20140115175723.GA4596-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>]
* Re: [PATCH] dcache: fix d_splice_alias handling of aliases [not found] ` <20140115175723.GA4596-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> @ 2014-01-15 18:25 ` Miklos Szeredi 2014-01-16 15:41 ` J. Bruce Fields 0 siblings, 1 reply; 21+ messages in thread From: Miklos Szeredi @ 2014-01-15 18:25 UTC (permalink / raw) To: J. Bruce Fields Cc: Miklos Szeredi, Al Viro, Linux-Fsdevel, Kernel Mailing List, Linux NFS list On Wed, Jan 15, 2014 at 6:57 PM, J. Bruce Fields <bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> wrote: > On Wed, Jan 15, 2014 at 06:34:56PM +0100, Miklos Szeredi wrote: >> On Wed, 2014-01-15 at 10:17 -0500, J. Bruce Fields wrote: >> > From: "J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> >> > >> > d_splice_alias can create duplicate directory aliases (in the !new >> > case), or (in the new case) d_move without holding appropriate locks. >> >> It can d_move, because the dentry is known to be disconnected, i.e. it >> doesn't have a parent for which we could obtain the lock. > > DCACHE_DISCONNECTED doesn't mean that. You're right, but I'm also right, because __d_find_alias() will check IS_ROOT() too. So only "root" disconnected dentries will be moved. > > When you lookup a dentry by filehandle that dentry is initially marked > DCACHE_DISCONNECTED. It is cleared only after reconnect_path() has > verified that the dentry is reachable all the way from the root. > > So !DCACHE_DISCONNECTED implies that the dentry is connected all the way > up to the root, but the converse is not true. > > This has been a source of confusion, but it is explained in > Documentation/filesystems/nfs/Exporting. Recently I've cleaned up a few > odd uses of DCACHE_DISCONNECTED and rewritten reconnect_path(), partly > as an attempt to clarify the situation. > > Let me know if any of that doesn't look right to you.... > >> > d_materialise_unique deals with both of these problems. (The latter >> > seems to be dealt by trylocks (see __d_unalias), which look like they >> > could cause spurious lookup failures--but that's at least better tha >> > corrupting the dcache.) >> > >> > Signed-off-by: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> >> > --- >> > fs/dcache.c | 25 +------------------------ >> > 1 file changed, 1 insertion(+), 24 deletions(-) >> > >> > Only lightly tested.... If this is right, then we can also just ditch >> > d_splice_alias completely, and clean up the various d_find_alias's. >> > >> > I think the only reason we have both d_splice_alias and >> > d_materialise_unique is that the former was written for exportable >> > filesystems and the latter for distributed filesystems. >> > >> > But we have at least one exportable filesystem (fuse) using >> > d_materialise_unique. And I doubt d_splice_alias was ever completely >> > correct even for on-disk filesystems. >> > >> > Am I missing some subtlety? >> >> One subtle difference is that for a non-directory d_splice_alias() will >> reconnect a DCACHE_DISCONNECTED dentry if one exists, while >> d_materialise_unique() will not. > > Actually until f80de2cde10350b8d146e375ff8b634e72e6a827 "dcache: don't > clear DCACHE_DISCONNECTED too early", it was the reverse: > d_materialise_unique cleared DISCONNECTED and d_splice_alias (correctly) > did not. > > The only place where it should be cleared is reconnect_path(). > >> Does this matter in practice? The small number of extra dentries >> probably does not matter. > > Directories are assumed to have unique aliases. When they don't, the > kernel can deadlock or crash. What I meant is that d_materialise_unique() will currently not reuse disconnected *nondirectory* dentries, hence there may be more aliases than necessary. This could easily be fixed, though. Thanks, Miklos > > --b. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] dcache: fix d_splice_alias handling of aliases 2014-01-15 18:25 ` Miklos Szeredi @ 2014-01-16 15:41 ` J. Bruce Fields [not found] ` <20140116154132.GB16829-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> 0 siblings, 1 reply; 21+ messages in thread From: J. Bruce Fields @ 2014-01-16 15:41 UTC (permalink / raw) To: Miklos Szeredi Cc: Miklos Szeredi, Al Viro, Linux-Fsdevel, Kernel Mailing List, Linux NFS list On Wed, Jan 15, 2014 at 07:25:11PM +0100, Miklos Szeredi wrote: > On Wed, Jan 15, 2014 at 6:57 PM, J. Bruce Fields <bfields@fieldses.org> wrote: > > On Wed, Jan 15, 2014 at 06:34:56PM +0100, Miklos Szeredi wrote: > >> On Wed, 2014-01-15 at 10:17 -0500, J. Bruce Fields wrote: > >> > From: "J. Bruce Fields" <bfields@redhat.com> > >> > > >> > d_splice_alias can create duplicate directory aliases (in the !new > >> > case), or (in the new case) d_move without holding appropriate locks. > >> > >> It can d_move, because the dentry is known to be disconnected, i.e. it > >> doesn't have a parent for which we could obtain the lock. > > > > DCACHE_DISCONNECTED doesn't mean that. > > You're right, but I'm also right, because __d_find_alias() will check > IS_ROOT() too. So only "root" disconnected dentries will be moved. You're right, I forgot that check. > >> One subtle difference is that for a non-directory d_splice_alias() will > >> reconnect a DCACHE_DISCONNECTED dentry if one exists, while > >> d_materialise_unique() will not. ... > >> Does this matter in practice? The small number of extra dentries > >> probably does not matter. > > > > Directories are assumed to have unique aliases. When they don't, the > > kernel can deadlock or crash. > > What I meant is that d_materialise_unique() will currently not reuse > disconnected *nondirectory* dentries, hence there may be more aliases > than necessary. This could easily be fixed, though. And, sorry, I did miss that you said "non-directory". But I think you have that backwards: d_splice_alias looks like: if (inode && S_ISDIR(inode->i_mode)) { ... } else { d_instantiate(dentry, inode); if (d_unhashed(dentry)) d_rehash(dentry); } So it ignores any existing aliases in the non-directory case. d_materialise_unique by contrast calls __d_instantiate_unique, which looks like it should avoid adding duplicates. So I think switching everyone to d_materialiase_unique would result in fewer dentries. But I've never seen any complaint about the issue and like you don't see a reason this would matter much either way. --b. ^ permalink raw reply [flat|nested] 21+ messages in thread
[parent not found: <20140116154132.GB16829-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>]
* Re: [PATCH] dcache: fix d_splice_alias handling of aliases [not found] ` <20140116154132.GB16829-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> @ 2014-01-16 16:13 ` Miklos Szeredi 0 siblings, 0 replies; 21+ messages in thread From: Miklos Szeredi @ 2014-01-16 16:13 UTC (permalink / raw) To: J. Bruce Fields Cc: Miklos Szeredi, Al Viro, Linux-Fsdevel, Kernel Mailing List, Linux NFS list On Thu, Jan 16, 2014 at 4:41 PM, J. Bruce Fields <bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> wrote: > And, sorry, I did miss that you said "non-directory". But I think you > have that backwards: d_splice_alias looks like: > > if (inode && S_ISDIR(inode->i_mode)) { > ... > } else { > d_instantiate(dentry, inode); > if (d_unhashed(dentry)) > d_rehash(dentry); > } > > So it ignores any existing aliases in the non-directory case. Okay. > > d_materialise_unique by contrast calls __d_instantiate_unique, which > looks like it should avoid adding duplicates. > > So I think switching everyone to d_materialiase_unique would result in > fewer dentries. But I've never seen any complaint about the issue and > like you don't see a reason this would matter much either way. So, yes, d_materialise_unique() looks like it has superior functionality compared to d_splice_alias(). Thanks, Miklos -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 21+ messages in thread
[parent not found: <20140115151749.GF23999-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>]
* Re: [PATCH] dcache: fix d_splice_alias handling of aliases [not found] ` <20140115151749.GF23999-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> @ 2014-01-16 16:10 ` J. Bruce Fields 2014-01-16 16:15 ` Steven Whitehouse 0 siblings, 1 reply; 21+ messages in thread From: J. Bruce Fields @ 2014-01-16 16:10 UTC (permalink / raw) To: Al Viro Cc: linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-nfs-u79uwXL29TY76Z2rM5mHXA, Miklos Szeredi, Steven Whitehouse On Wed, Jan 15, 2014 at 10:17:49AM -0500, bfields wrote: > From: "J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> > > d_splice_alias can create duplicate directory aliases (in the !new > case), or (in the new case) d_move without holding appropriate locks. > > d_materialise_unique deals with both of these problems. (The latter > seems to be dealt by trylocks (see __d_unalias), which look like they > could cause spurious lookup failures--but that's at least better than > corrupting the dcache.) > > Signed-off-by: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> > --- > fs/dcache.c | 25 +------------------------ > 1 file changed, 1 insertion(+), 24 deletions(-) > > Only lightly tested.... If this is right, then we can also just ditch > d_splice_alias completely, and clean up the various d_find_alias's. > > I think the only reason we have both d_splice_alias and > d_materialise_unique is that the former was written for exportable > filesystems and the latter for distributed filesystems. > > But we have at least one exportable filesystem (fuse) using > d_materialise_unique. And I doubt d_splice_alias was ever completely > correct even for on-disk filesystems. > > Am I missing some subtlety? Hm, I just noticed: commit 0d0d110720d7960b77c03c9f2597faaff4b484ae Author: Miklos Szeredi <mszeredi-AlSwsSmVLrQ@public.gmane.org> Date: Mon Sep 16 14:52:00 2013 +0200 GFS2: d_splice_alias() can't return error unless it was given an IS_ERR(inode), which isn't the case here. So clean up the unnecessary error handling in gfs2_create_inode(). This paves the way for real fixes (hence the stable Cc). Signed-off-by: Miklos Szeredi <mszeredi-AlSwsSmVLrQ@public.gmane.org> Signed-off-by: Steven Whitehouse <swhiteho-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> Cc: stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org While the statement is true for the current implementation of d_splice_alias, I don't think it's actually true for any correct implementation of d_splice_alias, which must be able to return at least -ELOOP in the directory case. Does gfs2 need fixing? --b. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] dcache: fix d_splice_alias handling of aliases 2014-01-16 16:10 ` J. Bruce Fields @ 2014-01-16 16:15 ` Steven Whitehouse 2014-01-16 16:44 ` J. Bruce Fields 0 siblings, 1 reply; 21+ messages in thread From: Steven Whitehouse @ 2014-01-16 16:15 UTC (permalink / raw) To: J. Bruce Fields Cc: Al Viro, linux-fsdevel, linux-kernel, linux-nfs, Miklos Szeredi Hi, On Thu, 2014-01-16 at 11:10 -0500, J. Bruce Fields wrote: > On Wed, Jan 15, 2014 at 10:17:49AM -0500, bfields wrote: > > From: "J. Bruce Fields" <bfields@redhat.com> > > > > d_splice_alias can create duplicate directory aliases (in the !new > > case), or (in the new case) d_move without holding appropriate locks. > > > > d_materialise_unique deals with both of these problems. (The latter > > seems to be dealt by trylocks (see __d_unalias), which look like they > > could cause spurious lookup failures--but that's at least better than > > corrupting the dcache.) > > > > Signed-off-by: J. Bruce Fields <bfields@redhat.com> > > --- > > fs/dcache.c | 25 +------------------------ > > 1 file changed, 1 insertion(+), 24 deletions(-) > > > > Only lightly tested.... If this is right, then we can also just ditch > > d_splice_alias completely, and clean up the various d_find_alias's. > > > > I think the only reason we have both d_splice_alias and > > d_materialise_unique is that the former was written for exportable > > filesystems and the latter for distributed filesystems. > > > > But we have at least one exportable filesystem (fuse) using > > d_materialise_unique. And I doubt d_splice_alias was ever completely > > correct even for on-disk filesystems. > > > > Am I missing some subtlety? > > Hm, I just noticed: > > commit 0d0d110720d7960b77c03c9f2597faaff4b484ae > Author: Miklos Szeredi <mszeredi@suse.cz> > Date: Mon Sep 16 14:52:00 2013 +0200 > > GFS2: d_splice_alias() can't return error > > unless it was given an IS_ERR(inode), which isn't the case here. So clean > up the unnecessary error handling in gfs2_create_inode(). > > This paves the way for real fixes (hence the stable Cc). > > Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> > Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> > Cc: stable@vger.kernel.org > > While the statement is true for the current implementation of > d_splice_alias, I don't think it's actually true for any correct > implementation of d_splice_alias, which must be able to return at least > -ELOOP in the directory case. Does gfs2 need fixing? > > --b. Yes, in that case, probably in two places, Steve. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] dcache: fix d_splice_alias handling of aliases 2014-01-16 16:15 ` Steven Whitehouse @ 2014-01-16 16:44 ` J. Bruce Fields 2014-01-16 16:54 ` Bob Peterson 0 siblings, 1 reply; 21+ messages in thread From: J. Bruce Fields @ 2014-01-16 16:44 UTC (permalink / raw) To: Steven Whitehouse Cc: Al Viro, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-nfs-u79uwXL29TY76Z2rM5mHXA, Miklos Szeredi On Thu, Jan 16, 2014 at 04:15:42PM +0000, Steven Whitehouse wrote: > Hi, > > On Thu, 2014-01-16 at 11:10 -0500, J. Bruce Fields wrote: > > On Wed, Jan 15, 2014 at 10:17:49AM -0500, bfields wrote: > > > From: "J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> > > > > > > d_splice_alias can create duplicate directory aliases (in the !new > > > case), or (in the new case) d_move without holding appropriate locks. > > > > > > d_materialise_unique deals with both of these problems. (The latter > > > seems to be dealt by trylocks (see __d_unalias), which look like they > > > could cause spurious lookup failures--but that's at least better than > > > corrupting the dcache.) > > > > > > Signed-off-by: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> > > > --- > > > fs/dcache.c | 25 +------------------------ > > > 1 file changed, 1 insertion(+), 24 deletions(-) > > > > > > Only lightly tested.... If this is right, then we can also just ditch > > > d_splice_alias completely, and clean up the various d_find_alias's. > > > > > > I think the only reason we have both d_splice_alias and > > > d_materialise_unique is that the former was written for exportable > > > filesystems and the latter for distributed filesystems. > > > > > > But we have at least one exportable filesystem (fuse) using > > > d_materialise_unique. And I doubt d_splice_alias was ever completely > > > correct even for on-disk filesystems. > > > > > > Am I missing some subtlety? > > > > Hm, I just noticed: > > > > commit 0d0d110720d7960b77c03c9f2597faaff4b484ae > > Author: Miklos Szeredi <mszeredi-AlSwsSmVLrQ@public.gmane.org> > > Date: Mon Sep 16 14:52:00 2013 +0200 > > > > GFS2: d_splice_alias() can't return error > > > > unless it was given an IS_ERR(inode), which isn't the case here. So clean > > up the unnecessary error handling in gfs2_create_inode(). > > > > This paves the way for real fixes (hence the stable Cc). > > > > Signed-off-by: Miklos Szeredi <mszeredi-AlSwsSmVLrQ@public.gmane.org> > > Signed-off-by: Steven Whitehouse <swhiteho-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> > > Cc: stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org > > > > While the statement is true for the current implementation of > > d_splice_alias, I don't think it's actually true for any correct > > implementation of d_splice_alias, which must be able to return at least > > -ELOOP in the directory case. Does gfs2 need fixing? > > > > --b. > > Yes, in that case, probably in two places, Something like this? (Except: is the inode cleanup right in the first chunk? And in the second chunk the cleanup could maybe be organized better even if I got it right....) --b. diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c index 7119504..19e0924 100644 --- a/fs/gfs2/inode.c +++ b/fs/gfs2/inode.c @@ -585,6 +585,9 @@ static int gfs2_create_inode(struct inode *dir, struct dentry *dentry, error = PTR_ERR(inode); if (!IS_ERR(inode)) { d = d_splice_alias(inode, dentry); + error = PTR_ERR(d); + if (IS_ERR(d)) + goto fail_gunlock; error = 0; if (file) { if (S_ISREG(inode->i_mode)) { @@ -779,6 +782,11 @@ static struct dentry *__gfs2_lookup(struct inode *dir, struct dentry *dentry, } d = d_splice_alias(inode, dentry); + if (IS_ERR(d)) { + iput(inode); + gfs2_glock_dq_uninit(&gh); + return ERR_PTR(error); + } if (file && S_ISREG(inode->i_mode)) error = finish_open(file, dentry, gfs2_open_common, opened); -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH] dcache: fix d_splice_alias handling of aliases 2014-01-16 16:44 ` J. Bruce Fields @ 2014-01-16 16:54 ` Bob Peterson 2014-01-16 18:51 ` J. Bruce Fields 0 siblings, 1 reply; 21+ messages in thread From: Bob Peterson @ 2014-01-16 16:54 UTC (permalink / raw) To: J. Bruce Fields Cc: Steven Whitehouse, Al Viro, linux-fsdevel, linux-kernel, linux-nfs, Miklos Szeredi ----- Original Message ----- | Something like this? (snip) | @@ -779,6 +782,11 @@ static struct dentry *__gfs2_lookup(struct inode *dir, | struct dentry *dentry, | } | | d = d_splice_alias(inode, dentry); | + if (IS_ERR(d)) { | + iput(inode); | + gfs2_glock_dq_uninit(&gh); | + return ERR_PTR(error); ---------------------------------^ Shouldn't that be ERR_PTR(d)? Bob Peterson Red Hat File Systems ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] dcache: fix d_splice_alias handling of aliases 2014-01-16 16:54 ` Bob Peterson @ 2014-01-16 18:51 ` J. Bruce Fields 2014-01-17 10:04 ` Steven Whitehouse 0 siblings, 1 reply; 21+ messages in thread From: J. Bruce Fields @ 2014-01-16 18:51 UTC (permalink / raw) To: Bob Peterson Cc: Steven Whitehouse, Al Viro, linux-fsdevel, linux-kernel, linux-nfs, Miklos Szeredi On Thu, Jan 16, 2014 at 11:54:07AM -0500, Bob Peterson wrote: > ----- Original Message ----- > | Something like this? > (snip) > | @@ -779,6 +782,11 @@ static struct dentry *__gfs2_lookup(struct inode *dir, > | struct dentry *dentry, > | } > | > | d = d_splice_alias(inode, dentry); > | + if (IS_ERR(d)) { > | + iput(inode); > | + gfs2_glock_dq_uninit(&gh); > | + return ERR_PTR(error); > > ---------------------------------^ > Shouldn't that be ERR_PTR(d)? Oops, yeah--well, actually just "d" I guess. This is what I've got for what it's worth. --b. commit 6fba5295019b52a03d76c9e9570952466051a7a6 Author: J. Bruce Fields <bfields@redhat.com> Date: Thu Jan 16 11:44:53 2014 -0500 gfs2: revert "GFS2: d_splice_alias() can't return error" 0d0d110720d7960b77c03c9f2597faaff4b484ae asserts that "d_splice_alias() can't return error unless it was given an IS_ERR(inode)". That was true of the implementation of d_splice_alias, but this is really a problem with d_splice_alias: at a minimum it should be able to return -ELOOP in the case where inserting the given dentry would cause a directory loop. Signed-off-by: J. Bruce Fields <bfields@redhat.com> diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c index 7119504..3f44902 100644 --- a/fs/gfs2/inode.c +++ b/fs/gfs2/inode.c @@ -585,6 +585,9 @@ static int gfs2_create_inode(struct inode *dir, struct dentry *dentry, error = PTR_ERR(inode); if (!IS_ERR(inode)) { d = d_splice_alias(inode, dentry); + error = PTR_ERR(d); + if (IS_ERR(d)) + goto fail_gunlock; error = 0; if (file) { if (S_ISREG(inode->i_mode)) { @@ -779,6 +782,11 @@ static struct dentry *__gfs2_lookup(struct inode *dir, struct dentry *dentry, } d = d_splice_alias(inode, dentry); + if (IS_ERR(d)) { + iput(inode); + gfs2_glock_dq_uninit(&gh); + return d; + } if (file && S_ISREG(inode->i_mode)) error = finish_open(file, dentry, gfs2_open_common, opened); ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH] dcache: fix d_splice_alias handling of aliases 2014-01-16 18:51 ` J. Bruce Fields @ 2014-01-17 10:04 ` Steven Whitehouse 2014-01-17 18:04 ` J. Bruce Fields 0 siblings, 1 reply; 21+ messages in thread From: Steven Whitehouse @ 2014-01-17 10:04 UTC (permalink / raw) To: J. Bruce Fields Cc: Bob Peterson, Al Viro, linux-fsdevel, linux-kernel, linux-nfs, Miklos Szeredi Hi, On Thu, 2014-01-16 at 13:51 -0500, J. Bruce Fields wrote: > On Thu, Jan 16, 2014 at 11:54:07AM -0500, Bob Peterson wrote: > > ----- Original Message ----- > > | Something like this? > > (snip) > > | @@ -779,6 +782,11 @@ static struct dentry *__gfs2_lookup(struct inode *dir, > > | struct dentry *dentry, > > | } > > | > > | d = d_splice_alias(inode, dentry); > > | + if (IS_ERR(d)) { > > | + iput(inode); > > | + gfs2_glock_dq_uninit(&gh); > > | + return ERR_PTR(error); > > > > ---------------------------------^ > > Shouldn't that be ERR_PTR(d)? > > Oops, yeah--well, actually just "d" I guess. > > This is what I've got for what it's worth. > > --b. > I think the patch looks ok to me. Did you give it a spin to test it? If so then I will put it in the GFS2 -nmw tree, Steve. > commit 6fba5295019b52a03d76c9e9570952466051a7a6 > Author: J. Bruce Fields <bfields@redhat.com> > Date: Thu Jan 16 11:44:53 2014 -0500 > > gfs2: revert "GFS2: d_splice_alias() can't return error" > > 0d0d110720d7960b77c03c9f2597faaff4b484ae asserts that "d_splice_alias() > can't return error unless it was given an IS_ERR(inode)". > > That was true of the implementation of d_splice_alias, but this is > really a problem with d_splice_alias: at a minimum it should be able to > return -ELOOP in the case where inserting the given dentry would cause a > directory loop. > > Signed-off-by: J. Bruce Fields <bfields@redhat.com> > > diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c > index 7119504..3f44902 100644 > --- a/fs/gfs2/inode.c > +++ b/fs/gfs2/inode.c > @@ -585,6 +585,9 @@ static int gfs2_create_inode(struct inode *dir, struct dentry *dentry, > error = PTR_ERR(inode); > if (!IS_ERR(inode)) { > d = d_splice_alias(inode, dentry); > + error = PTR_ERR(d); > + if (IS_ERR(d)) > + goto fail_gunlock; > error = 0; > if (file) { > if (S_ISREG(inode->i_mode)) { > @@ -779,6 +782,11 @@ static struct dentry *__gfs2_lookup(struct inode *dir, struct dentry *dentry, > } > > d = d_splice_alias(inode, dentry); > + if (IS_ERR(d)) { > + iput(inode); > + gfs2_glock_dq_uninit(&gh); > + return d; > + } > if (file && S_ISREG(inode->i_mode)) > error = finish_open(file, dentry, gfs2_open_common, opened); > ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] dcache: fix d_splice_alias handling of aliases 2014-01-17 10:04 ` Steven Whitehouse @ 2014-01-17 18:04 ` J. Bruce Fields 0 siblings, 0 replies; 21+ messages in thread From: J. Bruce Fields @ 2014-01-17 18:04 UTC (permalink / raw) To: Steven Whitehouse Cc: Bob Peterson, Al Viro, linux-fsdevel, linux-kernel, linux-nfs, Miklos Szeredi On Fri, Jan 17, 2014 at 10:04:30AM +0000, Steven Whitehouse wrote: > Hi, > > On Thu, 2014-01-16 at 13:51 -0500, J. Bruce Fields wrote: > > On Thu, Jan 16, 2014 at 11:54:07AM -0500, Bob Peterson wrote: > > > ----- Original Message ----- > > > | Something like this? > > > (snip) > > > | @@ -779,6 +782,11 @@ static struct dentry *__gfs2_lookup(struct inode *dir, > > > | struct dentry *dentry, > > > | } > > > | > > > | d = d_splice_alias(inode, dentry); > > > | + if (IS_ERR(d)) { > > > | + iput(inode); > > > | + gfs2_glock_dq_uninit(&gh); > > > | + return ERR_PTR(error); > > > > > > ---------------------------------^ > > > Shouldn't that be ERR_PTR(d)? > > > > Oops, yeah--well, actually just "d" I guess. > > > > This is what I've got for what it's worth. > > > > --b. > > > > I think the patch looks ok to me. Did you give it a spin to test it? Sure, just mkfs.gfs2 -p lock_nolock /dev/vdb mount /dev/vdb /mnt/ cd cthon NFSTESTDIR=/mnt/TMP ./runtests -a -f Which doesn't test much of anything (we'd need to inject artificial errors for now to test the error paths), but at least it doesn't blow up in some obvious way.... > If so then I will put it in the GFS2 -nmw tree, Thanks! --b. > > Steve. > > > commit 6fba5295019b52a03d76c9e9570952466051a7a6 > > Author: J. Bruce Fields <bfields@redhat.com> > > Date: Thu Jan 16 11:44:53 2014 -0500 > > > > gfs2: revert "GFS2: d_splice_alias() can't return error" > > > > 0d0d110720d7960b77c03c9f2597faaff4b484ae asserts that "d_splice_alias() > > can't return error unless it was given an IS_ERR(inode)". > > > > That was true of the implementation of d_splice_alias, but this is > > really a problem with d_splice_alias: at a minimum it should be able to > > return -ELOOP in the case where inserting the given dentry would cause a > > directory loop. > > > > Signed-off-by: J. Bruce Fields <bfields@redhat.com> > > > > diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c > > index 7119504..3f44902 100644 > > --- a/fs/gfs2/inode.c > > +++ b/fs/gfs2/inode.c > > @@ -585,6 +585,9 @@ static int gfs2_create_inode(struct inode *dir, struct dentry *dentry, > > error = PTR_ERR(inode); > > if (!IS_ERR(inode)) { > > d = d_splice_alias(inode, dentry); > > + error = PTR_ERR(d); > > + if (IS_ERR(d)) > > + goto fail_gunlock; > > error = 0; > > if (file) { > > if (S_ISREG(inode->i_mode)) { > > @@ -779,6 +782,11 @@ static struct dentry *__gfs2_lookup(struct inode *dir, struct dentry *dentry, > > } > > > > d = d_splice_alias(inode, dentry); > > + if (IS_ERR(d)) { > > + iput(inode); > > + gfs2_glock_dq_uninit(&gh); > > + return d; > > + } > > if (file && S_ISREG(inode->i_mode)) > > error = finish_open(file, dentry, gfs2_open_common, opened); > > > > ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] dcache: fix d_splice_alias handling of aliases 2014-01-15 15:17 [PATCH] dcache: fix d_splice_alias handling of aliases J. Bruce Fields 2014-01-15 17:34 ` Miklos Szeredi [not found] ` <20140115151749.GF23999-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> @ 2014-01-17 12:17 ` Christoph Hellwig [not found] ` <20140117121723.GA18375-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org> 2 siblings, 1 reply; 21+ messages in thread From: Christoph Hellwig @ 2014-01-17 12:17 UTC (permalink / raw) To: J. Bruce Fields Cc: Al Viro, linux-fsdevel, linux-kernel, linux-nfs, Miklos Szeredi On Wed, Jan 15, 2014 at 10:17:49AM -0500, J. Bruce Fields wrote: > From: "J. Bruce Fields" <bfields@redhat.com> > > d_splice_alias can create duplicate directory aliases (in the !new > case), or (in the new case) d_move without holding appropriate locks. > > d_materialise_unique deals with both of these problems. (The latter > seems to be dealt by trylocks (see __d_unalias), which look like they > could cause spurious lookup failures--but that's at least better than > corrupting the dcache.) I'm a bit worried about those spurious failures, maybe we should retry in that case? Also looking over the changes I wonder if the explicit cecking for aliases for every non-directory might have a major performance impact, all the dcache growling already was a major issues in NFS workloads years ago and I dumb it's become any better. Also looking at this area I'd like to suggest that if you end up merging the two I'd continue using the d_splice_alias name and calling conventions. Also the inode == NULL case really should be split out from d_materialise_unique into a separate helper. It shares almost no code, is entirely undocumented to the point that I don't really understand what the purpose is, and the only caller that can get there (fuse) already branches around that case in the caller anyway. ^ permalink raw reply [flat|nested] 21+ messages in thread
[parent not found: <20140117121723.GA18375-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>]
* Re: [PATCH] dcache: fix d_splice_alias handling of aliases [not found] ` <20140117121723.GA18375-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org> @ 2014-01-17 15:39 ` J. Bruce Fields [not found] ` <20140117153917.GA26636-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> 0 siblings, 1 reply; 21+ messages in thread From: J. Bruce Fields @ 2014-01-17 15:39 UTC (permalink / raw) To: Christoph Hellwig Cc: Al Viro, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-nfs-u79uwXL29TY76Z2rM5mHXA, Miklos Szeredi On Fri, Jan 17, 2014 at 04:17:23AM -0800, Christoph Hellwig wrote: > On Wed, Jan 15, 2014 at 10:17:49AM -0500, J. Bruce Fields wrote: > > From: "J. Bruce Fields" <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> > > > > d_splice_alias can create duplicate directory aliases (in the !new > > case), or (in the new case) d_move without holding appropriate locks. > > > > d_materialise_unique deals with both of these problems. (The latter > > seems to be dealt by trylocks (see __d_unalias), which look like they > > could cause spurious lookup failures--but that's at least better than > > corrupting the dcache.) > > I'm a bit worried about those spurious failures, maybe we should > retry in that case? Maybe so. I'm not sure how. d_materialise_unique is called from lookup and we'd need to at least drop the parent i_mutex to give a concurrent rename a chance to progress. I think NFS or cluster filesystem clients could hit this case with: host A host B --------- ------------------------- process 1 process 1 process 2 --------- --------- --------- mkdir foo/X mv foo/X bar/ stat bar/X mv baz qux When (B,1) looks up X in bar it finds that X still has an alias in foo, tries to rename that alias to bar/X, but can't because the current baz->qux rename is holding the rename mutex. So __d_unalias and the lookup return -EBUSY. None of those operations are particularly fast, so I'm a bit surprised we haven't already heard complaints. I must be missing some reason this doesn't happen. I guess I should set up a test. > Also looking over the changes I wonder if the explicit cecking for > aliases for every non-directory might have a major performance impact, > all the dcache growling already was a major issues in NFS workloads > years ago and I dumb it's become any better. This only happens on the first (uncached) lookup. So we've already acquired a bunch of locks and probably done a round trip to a disk or a server--is walking a (typically short) list really something to worry about? > Also looking at this area I'd like to suggest that if you end up > merging the two I'd continue using the d_splice_alias name and > calling conventions. OK, I guess I don't care which one we keep. > Also the inode == NULL case really should be split out from > d_materialise_unique into a separate helper. It shares almost no > code, is entirely undocumented to the point that I don't really > understand what the purpose is, and the only caller that can get > there (fuse) already branches around that case in the caller anyway. I think I see what you mean, I can fix that. --b. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 21+ messages in thread
[parent not found: <20140117153917.GA26636-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>]
* Re: [PATCH] dcache: fix d_splice_alias handling of aliases [not found] ` <20140117153917.GA26636-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> @ 2014-01-17 21:03 ` J. Bruce Fields [not found] ` <20140117210343.GD26636-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> 0 siblings, 1 reply; 21+ messages in thread From: J. Bruce Fields @ 2014-01-17 21:03 UTC (permalink / raw) To: Christoph Hellwig Cc: Al Viro, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-nfs-u79uwXL29TY76Z2rM5mHXA, Miklos Szeredi On Fri, Jan 17, 2014 at 10:39:17AM -0500, J. Bruce Fields wrote: > On Fri, Jan 17, 2014 at 04:17:23AM -0800, Christoph Hellwig wrote: > > Also the inode == NULL case really should be split out from > > d_materialise_unique into a separate helper. It shares almost no > > code, is entirely undocumented to the point that I don't really > > understand what the purpose is, and the only caller that can get > > there (fuse) already branches around that case in the caller anyway. > > I think I see what you mean, I can fix that. Actually: - two callers (fuse and nfs) take advantage of the NULL case. - d_splice_alias handles inode == NULL in the same way, and almost every caller takes advantage of that. So at least we wouldn't want to actually make the caller handle this case. But maybe there's still some opportunity for cleanup or documentation. --b. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 21+ messages in thread
[parent not found: <20140117210343.GD26636-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>]
* Re: [PATCH] dcache: fix d_splice_alias handling of aliases [not found] ` <20140117210343.GD26636-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> @ 2014-01-17 21:26 ` J. Bruce Fields 2014-01-23 21:27 ` [PATCH] dcache: make d_splice_alias use d_materialise_unique J. Bruce Fields 0 siblings, 1 reply; 21+ messages in thread From: J. Bruce Fields @ 2014-01-17 21:26 UTC (permalink / raw) To: Christoph Hellwig Cc: Al Viro, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-nfs-u79uwXL29TY76Z2rM5mHXA, Miklos Szeredi On Fri, Jan 17, 2014 at 04:03:43PM -0500, J. Bruce Fields wrote: > - d_splice_alias handles inode == NULL in the same way, Actually, not exactly; simplifying a bit, in the NULL case they do: d_splice_alias: __d_instantiate(dentry, NULL); security_d_instantiate(dentry, NULL); if (d_unhashed(dentry)) d_rehash(dentry); d_materialise_unique: BUG_ON(!d_unhashed(dentry)); __d_instantiate(dentry, NULL); d_rehash(dentry); security_d_instantiate(dentry, NULL); and a comment on d_splice_alias says Cluster filesystems may call this function with a negative, hashed dentry. In that case, we know that the inode will be a regular file, and also this will only occur during atomic_open. I don't understand those callers. But I guess it would be easy enough to handle in d_materialise_unique. --b. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH] dcache: make d_splice_alias use d_materialise_unique 2014-01-17 21:26 ` J. Bruce Fields @ 2014-01-23 21:27 ` J. Bruce Fields 2014-01-31 18:42 ` Al Viro 0 siblings, 1 reply; 21+ messages in thread From: J. Bruce Fields @ 2014-01-23 21:27 UTC (permalink / raw) To: Christoph Hellwig Cc: Al Viro, linux-fsdevel, linux-kernel, linux-nfs, Miklos Szeredi From: "J. Bruce Fields" <bfields@redhat.com> d_splice_alias can create duplicate directory aliases (in the !new case), or (in the new case) d_move directories without holding appropriate locks. d_materialise_unique deals with both of these problems. (The latter seems to be dealt by trylocks (see __d_unalias), which look like they could cause spurious lookup failures--but that's at least better than corrupting the dcache.) We have to fix up a couple minor differences between d_splice_alias and d_materialise_unique: - d_splice_alias also handles IS_ERR(inode) - d_splice_alias allows already-hashed dentries in one special case. We keep the d_splice_alias name and calling convention and deprecate d_materialise_unique, which has fewer callers. Also add some documentation. Signed-off-by: J. Bruce Fields <bfields@redhat.com> --- fs/dcache.c | 96 ++++++++++++++++++++++------------------------------------- 1 file changed, 35 insertions(+), 61 deletions(-) Here's a revised patch. If it looks reasonable then there are some further minor simplifications possible. (See git://linux-nfs.org/~bfields/linux-topics.git for-viro.) diff --git a/fs/dcache.c b/fs/dcache.c index 4bdb300..ec43194 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -1765,7 +1765,7 @@ EXPORT_SYMBOL(d_instantiate_unique); * @inode: inode to attach to this dentry * * Fill in inode information in the entry. If a directory alias is found, then - * return an error (and drop inode). Together with d_materialise_unique() this + * return an error (and drop inode). Together with d_splice_alias() this * guarantees that a directory inode may never have more than one alias. */ int d_instantiate_no_diralias(struct dentry *entry, struct inode *inode) @@ -1905,58 +1905,6 @@ struct dentry *d_obtain_alias(struct inode *inode) EXPORT_SYMBOL(d_obtain_alias); /** - * d_splice_alias - splice a disconnected dentry into the tree if one exists - * @inode: the inode which may have a disconnected dentry - * @dentry: a negative dentry which we want to point to the inode. - * - * If inode is a directory and has a 'disconnected' dentry (i.e. IS_ROOT and - * DCACHE_DISCONNECTED), then d_move that in place of the given dentry - * and return it, else simply d_add the inode to the dentry and return NULL. - * - * This is needed in the lookup routine of any filesystem that is exportable - * (via knfsd) so that we can build dcache paths to directories effectively. - * - * If a dentry was found and moved, then it is returned. Otherwise NULL - * is returned. This matches the expected return value of ->lookup. - * - * Cluster filesystems may call this function with a negative, hashed dentry. - * In that case, we know that the inode will be a regular file, and also this - * will only occur during atomic_open. So we need to check for the dentry - * being already hashed only in the final case. - */ -struct dentry *d_splice_alias(struct inode *inode, struct dentry *dentry) -{ - struct dentry *new = NULL; - - if (IS_ERR(inode)) - return ERR_CAST(inode); - - if (inode && S_ISDIR(inode->i_mode)) { - spin_lock(&inode->i_lock); - new = __d_find_alias(inode, 1); - if (new) { - BUG_ON(!(new->d_flags & DCACHE_DISCONNECTED)); - spin_unlock(&inode->i_lock); - security_d_instantiate(new, inode); - d_move(new, dentry); - iput(inode); - } else { - /* already taking inode->i_lock, so d_add() by hand */ - __d_instantiate(dentry, inode); - spin_unlock(&inode->i_lock); - security_d_instantiate(dentry, inode); - d_rehash(dentry); - } - } else { - d_instantiate(dentry, inode); - if (d_unhashed(dentry)) - d_rehash(dentry); - } - return new; -} -EXPORT_SYMBOL(d_splice_alias); - -/** * d_add_ci - lookup or allocate new dentry with case-exact name * @inode: the inode case-insensitive lookup has found * @dentry: the negative dentry that was passed to the parent's lookup func @@ -2716,19 +2664,37 @@ static void __d_materialise_dentry(struct dentry *dentry, struct dentry *anon) } /** - * d_materialise_unique - introduce an inode into the tree - * @dentry: candidate dentry + * d_splice_alias - introduce an inode into the tree * @inode: inode to bind to the dentry, to which aliases may be attached + * @dentry: candidate dentry + * + * Introduces a dentry into the tree, using either the provided dentry + * or, if appropriate, a preexisting alias for the same inode. Caller must + * hold the i_mutex of the parent directory. + * + * The inode may also be an error or NULL; in the former case the error is + * just passed through, in the latter we hash and instantiate the negative + * dentry. This allows filesystems to use d_splice_alias as the + * unconditional last step of their lookup method. + * + * d_splice_alias guarantees that directories will continue to have at + * most one alias, by moving an existing alias if necessary. If doing + * so would create a directory loop, it will fail with -ELOOP. + * + * d_splice_alias makes no such guarantee for files, but may still + * use a preexisting alias when it's convenient. * - * Introduces an dentry into the tree, substituting an extant disconnected - * root directory alias in its place if there is one. Caller must hold the - * i_mutex of the parent directory. + * Note that d_splice_alias normally expects an unhashed dentry, the single + * exception being that cluster filesystems may call this function + * during atomic_open with a negative hashed dentry, and with inode a + * regular file. */ -struct dentry *d_materialise_unique(struct dentry *dentry, struct inode *inode) +struct dentry *d_splice_alias(struct inode *inode, struct dentry *dentry) { struct dentry *actual; - BUG_ON(!d_unhashed(dentry)); + if (IS_ERR(inode)) + return ERR_CAST(inode); if (!inode) { actual = dentry; @@ -2788,7 +2754,8 @@ struct dentry *d_materialise_unique(struct dentry *dentry, struct inode *inode) spin_lock(&actual->d_lock); found: - _d_rehash(actual); + if (d_unhashed(actual)) + _d_rehash(actual); spin_unlock(&actual->d_lock); spin_unlock(&inode->i_lock); out_nolock: @@ -2800,6 +2767,13 @@ out_nolock: iput(inode); return actual; } +EXPORT_SYMBOL(d_splice_alias); + +/* deprecated synonym for d_splice_alias(inode, dentry): */ +struct dentry *d_materialise_unique(struct dentry *dentry, struct inode *inode) +{ + return d_splice_alias(inode, dentry); +} EXPORT_SYMBOL_GPL(d_materialise_unique); static int prepend(char **buffer, int *buflen, const char *str, int namelen) -- 1.7.9.5 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH] dcache: make d_splice_alias use d_materialise_unique 2014-01-23 21:27 ` [PATCH] dcache: make d_splice_alias use d_materialise_unique J. Bruce Fields @ 2014-01-31 18:42 ` Al Viro 2014-01-31 19:47 ` J. Bruce Fields 0 siblings, 1 reply; 21+ messages in thread From: Al Viro @ 2014-01-31 18:42 UTC (permalink / raw) To: J. Bruce Fields Cc: Christoph Hellwig, linux-fsdevel, linux-kernel, linux-nfs, Miklos Szeredi On Thu, Jan 23, 2014 at 04:27:00PM -0500, J. Bruce Fields wrote: > From: "J. Bruce Fields" <bfields@redhat.com> > > d_splice_alias can create duplicate directory aliases (in the !new > case), or (in the new case) d_move directories without holding > appropriate locks. Details, please. In the new case, we have IS_ROOT() alias found; what locks would that need? Note that d_materialise_unique() won't bother with __d_unalias() in such case - it does what d_move() would've done, without taking any mutex. In the !new case, we'd need a preexisting dentry alias, complete with parent. IOW, that's the case when directory already in the tree has been found during lookup from another parent. In which case we shouldn't be using d_splice_alias() at all, as it is (and it certainly can't happen for any local fs). Now, I agree that merging that with d_materialise_unique() might be a good idea, but commit message is wrong as it, AFAICS. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] dcache: make d_splice_alias use d_materialise_unique 2014-01-31 18:42 ` Al Viro @ 2014-01-31 19:47 ` J. Bruce Fields [not found] ` <20140131194758.GA24618-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> 0 siblings, 1 reply; 21+ messages in thread From: J. Bruce Fields @ 2014-01-31 19:47 UTC (permalink / raw) To: Al Viro Cc: Christoph Hellwig, linux-fsdevel, linux-kernel, linux-nfs, Miklos Szeredi On Fri, Jan 31, 2014 at 06:42:58PM +0000, Al Viro wrote: > On Thu, Jan 23, 2014 at 04:27:00PM -0500, J. Bruce Fields wrote: > > From: "J. Bruce Fields" <bfields@redhat.com> > > > > d_splice_alias can create duplicate directory aliases (in the !new > > case), or (in the new case) d_move directories without holding > > appropriate locks. > > Details, please. In the new case, we have IS_ROOT() alias found; > what locks would that need? Note that d_materialise_unique() won't > bother with __d_unalias() in such case - it does what d_move() would've > done, without taking any mutex. Of course you're right, and Miklos had pointed this out already and I forgot to update the changelog. Apologies! > In the !new case, we'd need a preexisting dentry alias, complete with > parent. IOW, that's the case when directory already in the tree > has been found during lookup from another parent. In which case > we shouldn't be using d_splice_alias() at all, as it is (and it > certainly can't happen for any local fs). Yes, except: won't a local filesystem will still hit this case on a filesystem that's corrupted to link a directory into multiple parents? Though in that case arguably the right behavior might be, say, WARN and return -EIO. > Now, I agree that merging that with d_materialise_unique() might be > a good idea, but commit message is wrong as it, AFAICS. Agreed, I'll fix and resend. Though now I wonder whether it's worth keeping two different interfaces, one for the case when finding a parent in a different directory is an error and one for the case when it's normal and you'd just like it fixed up. (Then one remaining thing I don't understand is how to make that fixing up reliable. Or is there some reason nobody hits the _EBUSY case of __d_unalias?) --b. ^ permalink raw reply [flat|nested] 21+ messages in thread
[parent not found: <20140131194758.GA24618-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>]
* Re: [PATCH] dcache: make d_splice_alias use d_materialise_unique [not found] ` <20140131194758.GA24618-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> @ 2014-02-06 17:03 ` J. Bruce Fields 0 siblings, 0 replies; 21+ messages in thread From: J. Bruce Fields @ 2014-02-06 17:03 UTC (permalink / raw) To: Al Viro Cc: Christoph Hellwig, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-nfs-u79uwXL29TY76Z2rM5mHXA, Miklos Szeredi, Harshula, hfuchi-H+wXaHxf7aLQT0dZR+AlfA, Trond Myklebust On Fri, Jan 31, 2014 at 02:47:58PM -0500, J. Bruce Fields wrote: > (Then one remaining thing I don't understand is how to make that fixing > up reliable. Or is there some reason nobody hits the _EBUSY case of > __d_unalias?) In fact, a reproducer found thanks to Hideshi Yamaoka: On server: while true; do mv /exports/DIR /exports/TO/DIR mv /exports/TO/DIR /exports/DIR done On client: mount -olookupcache=pos /mnt while true; do ls /mnt/TO; done Also on client: while true; do strace -e open cat /mnt/DIR/test.txt 2>&1 | grep EBUSY done Once all three of those loops are running I hit open("/mnt/DIR/test.txt", O_RDONLY) = -1 EBUSY very quickly. The "lookupcache=pos" isn't really necessary but makes the reproducer more reliable. (Originally this was seen on a single client: the client itself was doing the renames but also continually killing the second mv. I suspect that means the client sends the RENAME but then fails to update its dcache, the result being again that the client's dcache is out of sync with the server's tree and hence lookup is stuck trying to grab a dentry from another directory.) Is there some solution short of making ->lookup callers drop the i_mutex and retry??? --b. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2014-02-06 17:03 UTC | newest] Thread overview: 21+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-01-15 15:17 [PATCH] dcache: fix d_splice_alias handling of aliases J. Bruce Fields 2014-01-15 17:34 ` Miklos Szeredi 2014-01-15 17:57 ` J. Bruce Fields [not found] ` <20140115175723.GA4596-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> 2014-01-15 18:25 ` Miklos Szeredi 2014-01-16 15:41 ` J. Bruce Fields [not found] ` <20140116154132.GB16829-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> 2014-01-16 16:13 ` Miklos Szeredi [not found] ` <20140115151749.GF23999-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> 2014-01-16 16:10 ` J. Bruce Fields 2014-01-16 16:15 ` Steven Whitehouse 2014-01-16 16:44 ` J. Bruce Fields 2014-01-16 16:54 ` Bob Peterson 2014-01-16 18:51 ` J. Bruce Fields 2014-01-17 10:04 ` Steven Whitehouse 2014-01-17 18:04 ` J. Bruce Fields 2014-01-17 12:17 ` Christoph Hellwig [not found] ` <20140117121723.GA18375-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org> 2014-01-17 15:39 ` J. Bruce Fields [not found] ` <20140117153917.GA26636-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> 2014-01-17 21:03 ` J. Bruce Fields [not found] ` <20140117210343.GD26636-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> 2014-01-17 21:26 ` J. Bruce Fields 2014-01-23 21:27 ` [PATCH] dcache: make d_splice_alias use d_materialise_unique J. Bruce Fields 2014-01-31 18:42 ` Al Viro 2014-01-31 19:47 ` J. Bruce Fields [not found] ` <20140131194758.GA24618-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> 2014-02-06 17:03 ` J. Bruce Fields
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).