linux-xfs.vger.kernel.org archive mirror
From: Jan Kara <jack@suse.cz>
To: Dave Chinner <david@fromorbit.com>
Cc: "Jan Kara" <jack@suse.cz>,
	"Christoph Hellwig" <hch@infradead.org>,
	linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org,
	linux-bcachefs@vger.kernel.org, kent.overstreet@linux.dev,
	torvalds@linux-foundation.org,
	"Mickaël Salaün" <mic@linux.microsoft.com>,
	"Jann Horn" <jannh@google.com>, "Serge Hallyn" <serge@hallyn.com>,
	"Kees Cook" <keescook@chromium.org>,
	linux-security-module@vger.kernel.org,
	"Amir Goldstein" <amir73il@gmail.com>
Subject: Re: lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes()
Date: Thu, 3 Oct 2024 18:17:31 +0200
Message-ID: <20241003161731.kwveypqzu4bivesv@quack3>
In-Reply-To: <Zv6jV40xKIJYuePA@dread.disaster.area>

On Thu 03-10-24 23:59:51, Dave Chinner wrote:
> On Thu, Oct 03, 2024 at 02:56:50PM +0200, Jan Kara wrote:
> > On Thu 03-10-24 05:39:23, Christoph Hellwig wrote:
> > > @@ -789,11 +789,23 @@ static bool dispose_list(struct list_head *head)
> > >   */
> > >  static int evict_inode_fn(struct inode *inode, void *data)
> > >  {
> > > +	struct super_block *sb = inode->i_sb;
> > >  	struct list_head *dispose = data;
> > > +	bool post_unmount = !(sb->s_flags & SB_ACTIVE);
> > >  
> > >  	spin_lock(&inode->i_lock);
> > > -	if (atomic_read(&inode->i_count) ||
> > > -	    (inode->i_state & (I_NEW | I_FREEING | I_WILL_FREE))) {
> > > +	if (atomic_read(&inode->i_count)) {
> > > +		spin_unlock(&inode->i_lock);
> > > +
> > > +		/* for each watch, send FS_UNMOUNT and then remove it */
> > > +		if (post_unmount && fsnotify_sb_info(sb)) {
> > > +			fsnotify_inode(inode, FS_UNMOUNT);
> > > +			fsnotify_inode_delete(inode);
> > > +		}
> > 
> > This will not work because you are in an unsafe iterator holding
> > sb->s_inode_list_lock. To be able to call into fsnotify, you need to do the
> > iget / iput dance and release s_inode_list_lock, which does not work
> > when a filesystem has its own inodes iterator AFAICT... That's why I've
> > called it a layering violation.
> 
> The whole point of the iget/iput dance is to stabilise the
> s_inodes list iteration whilst it is unlocked - the actual fsnotify
> calls don't need an inode reference to work correctly.
> 
> IOWs, we don't need to run the fsnotify stuff right here - we can
> defer that like we do with the dispose list for all the inodes we
> mark as I_FREEING here.
> 
> So if we pass a structure:
> 
> struct evict_inode_args {
> 	struct list_head	dispose;
> 	struct list_head	fsnotify;
> };
> 
> If we use __iget() instead of requiring an inode state flag to keep
> the inode off the LRU for the fsnotify cleanup, then the code
> fragment above becomes:
> 
> 	if (atomic_read(&inode->i_count)) {
> 		if (post_unmount && fsnotify_sb_info(sb)) {
> 			__iget(inode);
> 			inode_lru_list_del(inode);
> 			spin_unlock(&inode->i_lock);
> 			list_add(&inode->i_lru, &args->fsnotify);
> 		}

Nit: i_lock also needs to be released in the else branch here. Otherwise an
interesting idea. Yes, something like this could work even in the unsafe
iterator.

> 		return INO_ITER_DONE;
> 	}
> 
> And then once we return to evict_inodes(), we do this:
> 
> 	while (!list_empty(&args->fsnotify)) {
> 		struct inode *inode;
> 
> 		inode = list_first_entry(&args->fsnotify, struct inode, i_lru);
> 		list_del_init(&inode->i_lru);
> 
> 		fsnotify_inode(inode, FS_UNMOUNT);
> 		fsnotify_inode_delete(inode);
> 		iput(inode);
> 		cond_resched();
> 	}
> 
> And so now all the fsnotify cleanup is done outside the traversal in
> one large batch from evict_inodes().

Yup.
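
For illustration only, here is a minimal userspace sketch of the two-phase
pattern discussed above: queue referenced inodes on a private list while the
list lock is held, then do the notification work after the lock is dropped.
None of this is kernel code. The toy_* types, the lock helpers, and the
have_watches flag are all hypothetical stand-ins (for struct inode, i_lock /
s_inode_list_lock, and the post_unmount && fsnotify_sb_info(sb) check), and
the dispose-list path is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Toy single-threaded lock: asserts catch double-lock or a missed unlock. */
struct toy_lock { int held; };

static void lock_acquire(struct toy_lock *l) { assert(!l->held); l->held = 1; }
static void lock_release(struct toy_lock *l) { assert(l->held); l->held = 0; }

/* Hypothetical stand-in for struct inode; only the fields the sketch needs. */
struct toy_inode {
	struct toy_lock lock;           /* models inode->i_lock */
	int count;                      /* models inode->i_count */
	int notified;                   /* set once the deferred work has run */
	struct toy_inode *next;         /* models the sb->s_inodes linkage */
	struct toy_inode *next_pending; /* models reusing i_lru for a private list */
};

struct toy_evict_args {
	struct toy_inode *pending;      /* models args->fsnotify */
	int have_watches;               /* models post_unmount && fsnotify_sb_info(sb) */
};

/* Phase 1: runs with the list lock held, so it only queues work. */
static void collect_fn(struct toy_inode *inode, struct toy_evict_args *args)
{
	lock_acquire(&inode->lock);
	if (inode->count > 0) {
		if (args->have_watches) {
			inode->count++;         /* models __iget() */
			lock_release(&inode->lock);
			inode->next_pending = args->pending;
			args->pending = inode;
		} else {
			/* The else branch has to drop the lock as well. */
			lock_release(&inode->lock);
		}
		return;
	}
	/* An unreferenced inode would go onto the dispose list; omitted here. */
	lock_release(&inode->lock);
}

/* Phase 2: runs after the list lock is dropped; returns the number handled. */
static int drain_pending(struct toy_evict_args *args)
{
	int handled = 0;

	while (args->pending) {
		struct toy_inode *inode = args->pending;

		args->pending = inode->next_pending;
		inode->next_pending = NULL;
		inode->notified = 1;            /* stands in for the fsnotify calls */
		lock_acquire(&inode->lock);
		inode->count--;                 /* models iput() */
		lock_release(&inode->lock);
		handled++;
	}
	return handled;
}

/* Walk the list under its lock, then do the deferred work unlocked. */
static int toy_evict_inodes(struct toy_lock *list_lock, struct toy_inode *head,
			    struct toy_evict_args *args)
{
	struct toy_inode *inode;

	lock_acquire(list_lock);
	for (inode = head; inode; inode = inode->next)
		collect_fn(inode, args);
	lock_release(list_lock);
	return drain_pending(args);
}
```

The extra reference taken in phase 1 is what keeps each queued entry alive
between the two phases. The asserting lock helpers make a forgotten unlock
fail loudly on the next acquire instead of passing silently.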

> As for the landlock code, I think it needs to have its own internal
> tracking mechanism and not search the sb inode list for inodes that
> it holds references to. LSM cleanup should be run before we
> get to tearing down the inode cache, not after....

Well, I think LSM cleanup could in principle be handled together with the
fsnotify cleanup, but I haven't checked the details.


								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
