From: ebiederm@xmission.com (Eric W. Biederman)
To: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
Dave Jones <davej@redhat.com>,
Linux Kernel <linux-kernel@vger.kernel.org>,
Miklos Szeredi <mszeredi@suse.cz>, Jan Kara <jack@suse.cz>,
Peter Zijlstra <peterz@infradead.org>,
linux-fsdevel@vger.kernel.org,
"J. Bruce Fields" <bfields@redhat.com>,
Sage Weil <sage@newdream.net>
Subject: Re: processes hung after sys_renameat, and 'missing' processes
Date: Thu, 07 Jun 2012 19:08:04 -0700 [thread overview]
Message-ID: <87mx4eimij.fsf@xmission.com> (raw)
In-Reply-To: <20120608003604.GK30000@ZenIV.linux.org.uk> (Al Viro's message of "Fri, 8 Jun 2012 01:36:04 +0100")
Al Viro <viro@ZenIV.linux.org.uk> writes:
> On Thu, Jun 07, 2012 at 04:57:13PM -0700, Linus Torvalds wrote:
>
>> Any per-filesystem mutex should do, so if sysfs always holds the
>> sysfs_mutex - and never allows user-initiated renames - it should be
>> safe.
>
> Frankly, I would very much prefer to have the same locking rules wherever
> possible. The locking system is already overcomplicated and making its
> analysis fs-dependent as well... <shudder> Sure, we can do that, and that
> might even work, until we find out that some piece of code that started
> as a helper to some function never called on sysfs dentries had been
> reused on the path that *is* reachable on sysfs. At which point we are
> suddenly in trouble.
Staring at it I see what I was missing. The practical issue is
lock_rename(), and any parts of the vfs that depend on lock_rename().
d_move and the dcache are made safe just by rename_lock. However other
parts of the vfs that care about using d_ancestor are not. I can't
immediately see a case that really cares but I can't rule such a case
out easily either.
> I wouldn't be bothered so much if the overall picture had been simpler;
> unfortunately, it isn't.
>
> Eric, how about this - if nothing else, that makes code in there simpler
> and less dependent on details of VFS guts:
>
> diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
> index e6bb9b2..5579826 100644
> --- a/fs/sysfs/dir.c
> +++ b/fs/sysfs/dir.c
> @@ -363,7 +363,7 @@ static void sysfs_dentry_iput(struct dentry *dentry, struct inode *inode)
> iput(inode);
> }
>
> -static const struct dentry_operations sysfs_dentry_ops = {
> +const struct dentry_operations sysfs_dentry_ops = {
> .d_revalidate = sysfs_dentry_revalidate,
> .d_delete = sysfs_dentry_delete,
> .d_iput = sysfs_dentry_iput,
> @@ -795,16 +795,8 @@ static struct dentry * sysfs_lookup(struct inode *dir, struct dentry *dentry,
> }
>
> /* instantiate and hash dentry */
> - ret = d_find_alias(inode);
> - if (!ret) {
> - d_set_d_op(dentry, &sysfs_dentry_ops);
> - dentry->d_fsdata = sysfs_get(sd);
> - d_add(dentry, inode);
> - } else {
> - d_move(ret, dentry);
> - iput(inode);
> - }
> -
> + dentry->d_fsdata = sysfs_get(sd);
> + ret = d_materialise_unique(dentry, inode);
I have a small problem with d_materialise_unique. For renames of files
d_materialise_unique calls __d_instantiate_unique. __d_instantiate_unique
does not detect renames of files. Which at least misses the rename
of sysfs symlinks.
Could we put together a d_materialise_unalias for inodes that we know
they always only have one dentry? That I would be happy to use.
I think the reason I would up with my own version was that the dcache
did no provide what I needed and it was just a few lines to code my own.
> diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c
> index 52c3bdb..c15a7a3 100644
> --- a/fs/sysfs/mount.c
> +++ b/fs/sysfs/mount.c
> @@ -68,6 +68,7 @@ static int sysfs_fill_super(struct super_block *sb, void *data, int silent)
> }
> root->d_fsdata = &sysfs_root;
> sb->s_root = root;
> + sb->s_d_op = &sysfs_dentry_ops;
I have no problem with this bit. To answer your earlier question s_d_op
predates this code which is why sysfs was not using it.
> return 0;
> }
>
> diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h
> index 661a963..d73c093 100644
> --- a/fs/sysfs/sysfs.h
> +++ b/fs/sysfs/sysfs.h
> @@ -157,6 +157,7 @@ extern struct kmem_cache *sysfs_dir_cachep;
> */
> extern struct mutex sysfs_mutex;
> extern spinlock_t sysfs_assoc_lock;
> +extern const struct dentry_operations sysfs_dentry_ops;
>
> extern const struct file_operations sysfs_dir_operations;
> extern const struct inode_operations sysfs_dir_inode_operations;
next prev parent reply other threads:[~2012-06-08 2:08 UTC|newest]
Thread overview: 65+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-06-03 22:36 processes hung after sys_renameat, and 'missing' processes Dave Jones
2012-06-03 22:51 ` Dave Jones
2012-06-03 23:07 ` Linus Torvalds
2012-06-03 23:17 ` Al Viro
2012-06-03 23:28 ` Al Viro
2012-06-03 23:40 ` Al Viro
2012-06-03 23:59 ` Al Viro
2012-06-04 0:07 ` Dave Jones
2012-06-06 19:42 ` Dave Jones
2012-06-06 22:38 ` Linus Torvalds
2012-06-06 23:00 ` Dave Jones
2012-06-06 23:31 ` Linus Torvalds
2012-06-06 23:54 ` Al Viro
2012-06-07 0:29 ` Dave Jones
2012-06-07 0:40 ` Al Viro
2012-06-07 0:42 ` Linus Torvalds
2012-06-07 1:19 ` Dave Jones
2012-06-07 1:29 ` Al Viro
2012-06-07 1:31 ` Dave Jones
2012-06-07 1:31 ` Al Viro
2012-06-07 1:42 ` Dave Jones
2012-06-07 1:45 ` Linus Torvalds
2012-06-07 1:54 ` Al Viro
2012-06-07 2:08 ` Dave Jones
2012-06-07 19:36 ` Al Viro
2012-06-07 20:43 ` Sage Weil
2012-06-07 23:12 ` Eric W. Biederman
2012-06-07 23:39 ` Al Viro
2012-06-07 23:57 ` Linus Torvalds
2012-06-08 0:36 ` Al Viro
2012-06-08 0:42 ` Linus Torvalds
2012-06-08 0:59 ` Al Viro
2012-06-08 5:25 ` Eric W. Biederman
2012-06-08 5:48 ` Al Viro
2012-06-08 7:54 ` Eric W. Biederman
2012-06-08 20:20 ` Al Viro
2012-06-08 2:08 ` Eric W. Biederman [this message]
2012-06-08 2:37 ` Al Viro
2012-06-08 2:18 ` Al Viro
2012-06-08 16:22 ` J. Bruce Fields
2012-06-08 17:44 ` Linus Torvalds
2012-06-11 12:17 ` J. Bruce Fields
2012-06-07 1:40 ` Linus Torvalds
2012-06-07 0:35 ` Linus Torvalds
2012-06-07 10:26 ` Peter Zijlstra
2012-06-07 15:30 ` Linus Torvalds
2012-06-08 7:31 ` Peter Zijlstra
2012-06-08 14:38 ` Dave Jones
2012-06-08 14:51 ` Peter Zijlstra
2012-06-08 15:01 ` Dave Jones
2012-06-08 15:11 ` Peter Zijlstra
2012-06-08 15:21 ` Dave Jones
2012-06-08 14:46 ` J. Bruce Fields
2012-06-08 15:08 ` Peter Zijlstra
2012-06-11 12:17 ` J. Bruce Fields
2012-06-04 0:00 ` Dave Jones
2012-06-04 0:16 ` Linus Torvalds
2012-06-04 0:20 ` Al Viro
2012-06-04 9:35 ` Peter Zijlstra
2012-06-04 9:29 ` Peter Zijlstra
2012-06-04 10:49 ` Peter Zijlstra
2012-06-07 0:13 ` Dave Jones
-- strict thread matches above, loose matches on Subject: below --
2012-06-07 7:07 Miklos Szeredi
2012-06-07 15:44 ` Linus Torvalds
2012-06-11 16:02 ` Miklos Szeredi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87mx4eimij.fsf@xmission.com \
--to=ebiederm@xmission.com \
--cc=bfields@redhat.com \
--cc=davej@redhat.com \
--cc=jack@suse.cz \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mszeredi@suse.cz \
--cc=peterz@infradead.org \
--cc=sage@newdream.net \
--cc=torvalds@linux-foundation.org \
--cc=viro@ZenIV.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.