From: John McCutchan <ttb@tentacle.dhs.org>
To: Al Viro <viro@ftp.linux.org.uk>
Cc: Linus Torvalds <torvalds@osdl.org>, Ray Lee <ray@madrabbit.org>,
Andrew Morton <akpm@osdl.org>,
Linux Kernel <linux-kernel@vger.kernel.org>,
Robert Love <rml@novell.com>, Al Viro <viro@ZenIV.linux.org.uk>
Subject: Re: [patch] stop inotify from sending random DELETE_SELF event under load
Date: Tue, 20 Sep 2005 21:41:47 -0400 [thread overview]
Message-ID: <1127266907.3950.5.camel@vertex> (raw)
In-Reply-To: <20050921010154.GR7992@ftp.linux.org.uk>
On Wed, 2005-09-21 at 02:01 +0100, Al Viro wrote:
> On Tue, Sep 20, 2005 at 06:53:34PM -0400, John McCutchan wrote:
> > Is there some reason we can't just do this from vfs_unlink
> >
> > inode = dentry->inode;
> > iget (inode);
> > d_delete (dentry);
> > fsnotify_inoderemove (inode);
> > iput (inode);
> >
> > This would allow us to have immediate event notification, and avoid a
> > race with the inode going away, right?
>
> Playing with references to struct inode means playing dirty tricks
> behind the filesystem's back. Doing that in a way that really changes
> inode lifetime means asking for trouble. Combined with a dirty trick
> *already* pulled by sys_unlink() to postpone the final iput until after
> we unlock the parent, it means breakage (and aforementioned dirty trick
> took some rather interesting logics to compensate for in the first place).
>
> Moreover, your suggestion would do that to _everyone_, whether they use
> inotify or not. NAK.
Got it.
>
> > static inline void fsnotify_inoderemove(struct inode *inode)
> > {
> > - inotify_inode_queue_event(inode, IN_DELETE_SELF, 0, NULL);
> > - inotify_inode_is_dead(inode);
> > + inotify_inode_queue_event(inode, IN_DELETE_SELF, inode->i_nlink, NULL);
> > + if (inode->i_nlink == 0)
> > + inotify_inode_is_dead(inode);
> > }
>
> Assumes that filesystem treats ->i_nlink on final iput() in usual way.
> It doesn't have to.
>
I grepped all the filesystems, and they all seem to use
generic_drop_inode, except for hugetlbfs, which seems to have the same
logic of (!inode->i_nlink).
> BTW, what happens if one uses inotify on procfs? Or sysfs, for that matter?
> Fundamental problem with that sucker is that you are playing games with
> lifetime rules of inodes in a way that might be OK for some filesystems,
> but violates a lot of assumptions made by other...
>
Honestly, I don't know. And I don't think I know enough to say with any
certainty how either of them would work. Would a black list of
filesystems that don't want inotify on them be acceptable?
> BTW^2, what guarantees that inotify_unmount_inodes() will not happen while we
> are in inotify_release()? That would happily keep watch refcount bumped,
> so it would outlive inotify_unmount_inodes(). Sure, it would be dropped.
> And call iput() on a pinned inode that had outlived the umount(). Oops...
Good catch,
Index: linux/fs/inotify.c
===================================================================
--- linux.orig/fs/inotify.c 2005-08-31 15:41:11.000000000 -0400
+++ linux/fs/inotify.c 2005-09-20 21:18:35.000000000 -0400
@@ -756,6 +756,7 @@
* do not know the inode until we iterate to the watch. But we need to
* hold inode->inotify_sem before dev->sem. The following works.
*/
+ down(&iprune_sem);
while (1) {
struct inotify_watch *watch;
struct list_head *watches;
@@ -779,6 +780,7 @@
up(&inode->inotify_sem);
put_inotify_watch(watch);
}
+ up(&iprune_sem);
/* destroy all of the events on this device */
down(&dev->sem);
next prev parent reply other threads:[~2005-09-21 1:40 UTC|newest]
Thread overview: 44+ messages / expand[flat|nested] mbox.gz Atom feed top
2005-09-20 0:48 [patch] stop inotify from sending random DELETE_SELF event under load John McCutchan
2005-09-20 1:37 ` Linus Torvalds
2005-09-20 2:00 ` John McCutchan
2005-09-20 2:20 ` Linus Torvalds
2005-09-20 3:46 ` John McCutchan
2005-09-20 4:03 ` Linus Torvalds
2005-09-20 4:24 ` Al Viro
2005-09-20 4:30 ` Linus Torvalds
2005-09-20 4:36 ` John McCutchan
2005-09-20 4:46 ` Al Viro
2005-09-20 4:53 ` John McCutchan
2005-09-20 4:58 ` Al Viro
2005-09-20 5:06 ` John McCutchan
2005-09-20 5:17 ` Al Viro
2005-09-20 12:34 ` John McCutchan
2005-09-20 16:38 ` Al Viro
2005-09-20 17:44 ` Ray Lee
2005-09-20 18:12 ` Linus Torvalds
2005-09-20 18:22 ` Al Viro
2005-09-20 19:37 ` Linus Torvalds
2005-09-20 22:53 ` John McCutchan
2005-09-21 0:33 ` Linus Torvalds
2005-09-21 0:52 ` John McCutchan
2005-09-21 1:01 ` Al Viro
2005-09-21 1:41 ` John McCutchan [this message]
2005-09-21 2:36 ` Al Viro
2005-09-21 8:35 ` Christoph Hellwig
2005-09-21 4:15 ` [Ocfs2-devel] " Joel Becker
2005-09-21 9:15 ` Joel Becker
2005-09-21 9:17 ` Christoph Hellwig
2005-09-21 9:45 ` [Ocfs2-devel] " Joel Becker
2005-09-21 14:45 ` Joel Becker
2005-09-21 13:08 ` [Ocfs2-devel] " Mark Fasheh
2005-09-21 18:08 ` Mark Fasheh
2005-09-20 18:26 ` John McCutchan
2005-09-20 19:39 ` Linus Torvalds
2005-09-20 4:56 ` Linus Torvalds
2005-09-20 4:52 ` Linus Torvalds
2005-09-20 4:27 ` John McCutchan
2005-09-20 3:33 ` Al Viro
2005-09-20 3:50 ` John McCutchan
2005-09-20 3:31 ` Al Viro
2005-09-20 3:51 ` John McCutchan
2005-09-20 8:33 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1127266907.3950.5.camel@vertex \
--to=ttb@tentacle.dhs.org \
--cc=akpm@osdl.org \
--cc=linux-kernel@vger.kernel.org \
--cc=ray@madrabbit.org \
--cc=rml@novell.com \
--cc=torvalds@osdl.org \
--cc=viro@ZenIV.linux.org.uk \
--cc=viro@ftp.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.