* lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes() [not found] ` <20241002014017.3801899-5-david@fromorbit.com> @ 2024-10-03 7:23 ` Christoph Hellwig 2024-10-03 7:38 ` Christoph Hellwig 0 siblings, 1 reply; 34+ messages in thread From: Christoph Hellwig @ 2024-10-03 7:23 UTC (permalink / raw) To: Dave Chinner Cc: linux-fsdevel, linux-xfs, linux-bcachefs, kent.overstreet, torvalds, Mickaël Salaün, Jann Horn, Serge Hallyn, Kees Cook, linux-security-module On Wed, Oct 02, 2024 at 11:33:21AM +1000, Dave Chinner wrote: > --- a/security/landlock/fs.c > +++ b/security/landlock/fs.c > @@ -1223,109 +1223,60 @@ static void hook_inode_free_security_rcu(void *inode_security) > > /* > * Release the inodes used in a security policy. > - * > - * Cf. fsnotify_unmount_inodes() and invalidate_inodes() > */ > +static int release_inode_fn(struct inode *inode, void *data) Looks like this is called from the sb_delete LSM hook, which is only implemented by landlock, and only called from generic_shutdown_super, separated from evict_inodes only by a call to fsnotify_sb_delete. Why did LSM not hook into that and instead add another iteration over the per-sb inode list? (Note that this is not trying to get Dave to fix this, just noticed it when reviewing this series) ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes() 2024-10-03 7:23 ` lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes() Christoph Hellwig @ 2024-10-03 7:38 ` Christoph Hellwig 2024-10-03 11:57 ` Jan Kara 0 siblings, 1 reply; 34+ messages in thread From: Christoph Hellwig @ 2024-10-03 7:38 UTC (permalink / raw) To: Dave Chinner Cc: linux-fsdevel, linux-xfs, linux-bcachefs, kent.overstreet, torvalds, Mickaël Salaün, Jann Horn, Serge Hallyn, Kees Cook, linux-security-module, Jan Kara, Amir Goldstein On Thu, Oct 03, 2024 at 12:23:41AM -0700, Christoph Hellwig wrote: > On Wed, Oct 02, 2024 at 11:33:21AM +1000, Dave Chinner wrote: > > --- a/security/landlock/fs.c > > +++ b/security/landlock/fs.c > > @@ -1223,109 +1223,60 @@ static void hook_inode_free_security_rcu(void *inode_security) > > > > /* > > * Release the inodes used in a security policy. > > - * > > - * Cf. fsnotify_unmount_inodes() and invalidate_inodes() > > */ > > +static int release_inode_fn(struct inode *inode, void *data) > > Looks like this is called from the sb_delete LSM hook, which > is only implemented by landlock, and only called from > generic_shutdown_super, separated from evict_inodes only by call > to fsnotify_sb_delete. Why did LSM not hook into that and instead And the main thing that fsnotify_sb_delete does is yet another inode iteration... Any chance you all could get together and figure out how to get down to a single sb inode iteration per unmount? ^ permalink raw reply [flat|nested] 34+ messages in thread
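For reference, the unmount path under discussion looks roughly like this today (a simplified outline of generic_shutdown_super() based on the description in this thread; each of the last three calls currently does its own pass over the sb->s_inodes list):

	generic_shutdown_super(sb)
		shrink_dcache_for_umount(sb);
		sync_filesystem(sb);
		sb->s_flags &= ~SB_ACTIVE;
		...
		evict_inodes(sb);		/* s_inodes walk #1: evict unreferenced inodes */
		fsnotify_sb_delete(sb);		/* s_inodes walk #2: send FS_UNMOUNT, drop marks */
		security_sb_delete(sb);		/* s_inodes walk #3: only landlock's hook_sb_delete() */
		...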
* Re: lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes() 2024-10-03 7:38 ` Christoph Hellwig @ 2024-10-03 11:57 ` Jan Kara 2024-10-03 12:11 ` Christoph Hellwig 2024-10-07 20:37 ` Linus Torvalds 0 siblings, 2 replies; 34+ messages in thread From: Jan Kara @ 2024-10-03 11:57 UTC (permalink / raw) To: Christoph Hellwig Cc: Dave Chinner, linux-fsdevel, linux-xfs, linux-bcachefs, kent.overstreet, torvalds, Mickaël Salaün, Jann Horn, Serge Hallyn, Kees Cook, linux-security-module, Jan Kara, Amir Goldstein On Thu 03-10-24 00:38:05, Christoph Hellwig wrote: > On Thu, Oct 03, 2024 at 12:23:41AM -0700, Christoph Hellwig wrote: > > On Wed, Oct 02, 2024 at 11:33:21AM +1000, Dave Chinner wrote: > > > --- a/security/landlock/fs.c > > > +++ b/security/landlock/fs.c > > > @@ -1223,109 +1223,60 @@ static void hook_inode_free_security_rcu(void *inode_security) > > > > > > /* > > > * Release the inodes used in a security policy. > > > - * > > > - * Cf. fsnotify_unmount_inodes() and invalidate_inodes() > > > */ > > > +static int release_inode_fn(struct inode *inode, void *data) > > > > Looks like this is called from the sb_delete LSM hook, which > > is only implemented by landlock, and only called from > > generic_shutdown_super, separated from evict_inodes only by call > > to fsnotify_sb_delete. Why did LSM not hook into that and instead > > An the main thing that fsnotify_sb_delete does is yet another inode > iteration.. > > Ay chance you all could get together an figure out how to get down > to a single sb inode iteration per unmount? Fair enough. If we go with the iterator variant I've suggested to Dave in [1], we could combine the evict_inodes(), fsnotify_unmount_inodes() and Landlock's hook_sb_delete() into a single iteration relatively easily. But I'd hold off on that conversion until this series lands. Honza [1] https://lore.kernel.org/all/20241003114555.bl34fkqsja4s5tok@quack3 -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes() 2024-10-03 11:57 ` Jan Kara @ 2024-10-03 12:11 ` Christoph Hellwig 2024-10-03 12:26 ` Jan Kara 2024-10-07 20:37 ` Linus Torvalds 1 sibling, 1 reply; 34+ messages in thread From: Christoph Hellwig @ 2024-10-03 12:11 UTC (permalink / raw) To: Jan Kara Cc: Christoph Hellwig, Dave Chinner, linux-fsdevel, linux-xfs, linux-bcachefs, kent.overstreet, torvalds, Mickaël Salaün, Jann Horn, Serge Hallyn, Kees Cook, linux-security-module, Amir Goldstein On Thu, Oct 03, 2024 at 01:57:21PM +0200, Jan Kara wrote: > Fair enough. If we go with the iterator variant I've suggested to Dave in > [1], we could combine the evict_inodes(), fsnotify_unmount_inodes() and > Landlocks hook_sb_delete() into a single iteration relatively easily. But > I'd wait with that convertion until this series lands. I don't see how that has anything to do with iterators or not. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes() 2024-10-03 12:11 ` Christoph Hellwig @ 2024-10-03 12:26 ` Jan Kara 2024-10-03 12:39 ` Christoph Hellwig 0 siblings, 1 reply; 34+ messages in thread From: Jan Kara @ 2024-10-03 12:26 UTC (permalink / raw) To: Christoph Hellwig Cc: Jan Kara, Dave Chinner, linux-fsdevel, linux-xfs, linux-bcachefs, kent.overstreet, torvalds, Mickaël Salaün, Jann Horn, Serge Hallyn, Kees Cook, linux-security-module, Amir Goldstein On Thu 03-10-24 05:11:11, Christoph Hellwig wrote: > On Thu, Oct 03, 2024 at 01:57:21PM +0200, Jan Kara wrote: > > Fair enough. If we go with the iterator variant I've suggested to Dave in > > [1], we could combine the evict_inodes(), fsnotify_unmount_inodes() and > > Landlocks hook_sb_delete() into a single iteration relatively easily. But > > I'd wait with that convertion until this series lands. > > I don't see how that has anything to do with iterators or not. Well, the patches would obviously conflict, which seems pointless if we could live with three iterations for a few years until somebody noticed :). And with Dave's current version of the iterators it will not be possible to integrate the evict_inodes() iteration with the other two without a layering violation. Still, we could go from 3 to 2 iterations. Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes() 2024-10-03 12:26 ` Jan Kara @ 2024-10-03 12:39 ` Christoph Hellwig 2024-10-03 12:56 ` Jan Kara 0 siblings, 1 reply; 34+ messages in thread From: Christoph Hellwig @ 2024-10-03 12:39 UTC (permalink / raw) To: Jan Kara Cc: Christoph Hellwig, Dave Chinner, linux-fsdevel, linux-xfs, linux-bcachefs, kent.overstreet, torvalds, Mickaël Salaün, Jann Horn, Serge Hallyn, Kees Cook, linux-security-module, Amir Goldstein On Thu, Oct 03, 2024 at 02:26:57PM +0200, Jan Kara wrote: > On Thu 03-10-24 05:11:11, Christoph Hellwig wrote: > > On Thu, Oct 03, 2024 at 01:57:21PM +0200, Jan Kara wrote: > > > Fair enough. If we go with the iterator variant I've suggested to Dave in > > > [1], we could combine the evict_inodes(), fsnotify_unmount_inodes() and > > > Landlocks hook_sb_delete() into a single iteration relatively easily. But > > > I'd wait with that convertion until this series lands. > > > > I don't see how that has anything to do with iterators or not. > > Well, the patches would obviously conflict Conflict with what? > which seems pointless if we > could live with three iterations for a few years until somebody noticed :). > And with current Dave's version of iterators it will not be possible to > integrate evict_inodes() iteration with the other two without a layering > violation. Still we could go from 3 to 2 iterations. What layering violation? Below is a quick compile-tested patch to do the fsnotify side and get rid of the fsnotify iteration, which looks easily worth it. Landlock is a bit more complex because of the LSM hooks, and the internal underobj abstraction inside of landlock. But looking at the release-inode-data vs unmount synchronization it has, and the duplicate code, I think doing it this way is worth the effort even more; it'll just need someone who knows Landlock and the LSM layering to help with the work. 
diff --git a/fs/inode.c b/fs/inode.c index 3f335f78c5b228..7d5f8a09e4d29d 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -789,11 +789,23 @@ static bool dispose_list(struct list_head *head) */ static int evict_inode_fn(struct inode *inode, void *data) { + struct super_block *sb = inode->i_sb; struct list_head *dispose = data; + bool post_unmount = !(sb->s_flags & SB_ACTIVE); spin_lock(&inode->i_lock); - if (atomic_read(&inode->i_count) || - (inode->i_state & (I_NEW | I_FREEING | I_WILL_FREE))) { + if (atomic_read(&inode->i_count)) { + spin_unlock(&inode->i_lock); + + /* for each watch, send FS_UNMOUNT and then remove it */ + if (post_unmount && fsnotify_sb_info(sb)) { + fsnotify_inode(inode, FS_UNMOUNT); + fsnotify_inode_delete(inode); + } + return INO_ITER_DONE; + } + + if (inode->i_state & (I_NEW | I_FREEING | I_WILL_FREE)) { spin_unlock(&inode->i_lock); return INO_ITER_DONE; } diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c index 68c34ed9427190..cf89aa69e82c8d 100644 --- a/fs/notify/fsnotify.c +++ b/fs/notify/fsnotify.c @@ -28,16 +28,6 @@ void __fsnotify_vfsmount_delete(struct vfsmount *mnt) fsnotify_clear_marks_by_mount(mnt); } -static int fsnotify_unmount_inode_fn(struct inode *inode, void *data) -{ - spin_unlock(&inode->i_lock); - - /* for each watch, send FS_UNMOUNT and then remove it */ - fsnotify_inode(inode, FS_UNMOUNT); - fsnotify_inode_delete(inode); - return INO_ITER_DONE; -} - void fsnotify_sb_delete(struct super_block *sb) { struct fsnotify_sb_info *sbinfo = fsnotify_sb_info(sb); @@ -46,19 +36,6 @@ void fsnotify_sb_delete(struct super_block *sb) if (!sbinfo) return; - /* - * If i_count is zero, the inode cannot have any watches and - * doing an __iget/iput with SB_ACTIVE clear would actually - * evict all inodes with zero i_count from icache which is - * unnecessarily violent and may in fact be illegal to do. - * However, we should have been called /after/ evict_inodes - * removed all zero refcount inodes, in any case. Hence we use - * INO_ITER_REFERENCED to ensure zero refcount inodes are filtered - * properly. - */ - super_iter_inodes(sb, fsnotify_unmount_inode_fn, NULL, - INO_ITER_REFERENCED); - fsnotify_clear_marks_by_sb(sb); /* Wait for outstanding object references from connectors */ wait_var_event(fsnotify_sb_watched_objects(sb), diff --git a/fs/super.c b/fs/super.c index 971ad4e996e0ba..88dd1703fe73db 100644 --- a/fs/super.c +++ b/fs/super.c @@ -167,28 +167,17 @@ static void super_wake(struct super_block *sb, unsigned int flag) wake_up_var(&sb->s_flags); } -bool super_iter_iget(struct inode *inode, int flags) +bool super_iter_iget(struct inode *inode) { - bool ret = false; + bool ret = false; spin_lock(&inode->i_lock); - if (inode->i_state & (I_NEW | I_FREEING | I_WILL_FREE)) - goto out_unlock; - - /* - * Skip over zero refcount inode if the caller only wants - * referenced inodes to be iterated. 
- */ - if ((flags & INO_ITER_REFERENCED) && - !atomic_read(&inode->i_count)) - goto out_unlock; - - __iget(inode); - ret = true; -out_unlock: + if (!(inode->i_state & (I_NEW | I_FREEING | I_WILL_FREE))) { + __iget(inode); + ret = true; + } spin_unlock(&inode->i_lock); return ret; - } EXPORT_SYMBOL_GPL(super_iter_iget); @@ -216,7 +205,7 @@ int super_iter_inodes(struct super_block *sb, ino_iter_fn iter_fn, spin_lock(&sb->s_inode_list_lock); list_for_each_entry(inode, &sb->s_inodes, i_sb_list) { - if (!super_iter_iget(inode, flags)) + if (!super_iter_iget(inode)) continue; spin_unlock(&sb->s_inode_list_lock); iput(old_inode); diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c index ee544556cee728..5a174e690424fb 100644 --- a/fs/xfs/xfs_icache.c +++ b/fs/xfs/xfs_icache.c @@ -1654,8 +1654,7 @@ xfs_iter_vfs_igrab( if (ip->i_flags & XFS_ITER_VFS_NOGRAB_IFLAGS) goto out_unlock_noent; - if ((flags & INO_ITER_UNSAFE) || - super_iter_iget(inode, flags)) + if ((flags & INO_ITER_UNSAFE) || super_iter_iget(inode)) ret = true; out_unlock_noent: diff --git a/include/linux/fs.h b/include/linux/fs.h index 2aa335228b84bf..a3c682f0d94c1b 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2224,7 +2224,7 @@ enum freeze_holder { typedef int (*ino_iter_fn)(struct inode *inode, void *priv); int super_iter_inodes(struct super_block *sb, ino_iter_fn iter_fn, void *private_data, int flags); -bool super_iter_iget(struct inode *inode, int flags); +bool super_iter_iget(struct inode *inode); struct super_operations { struct inode *(*alloc_inode)(struct super_block *sb); ^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes() 2024-10-03 12:39 ` Christoph Hellwig @ 2024-10-03 12:56 ` Jan Kara 2024-10-03 13:04 ` Christoph Hellwig 2024-10-03 13:59 ` Dave Chinner 0 siblings, 2 replies; 34+ messages in thread From: Jan Kara @ 2024-10-03 12:56 UTC (permalink / raw) To: Christoph Hellwig Cc: Jan Kara, Dave Chinner, linux-fsdevel, linux-xfs, linux-bcachefs, kent.overstreet, torvalds, Mickaël Salaün, Jann Horn, Serge Hallyn, Kees Cook, linux-security-module, Amir Goldstein On Thu 03-10-24 05:39:23, Christoph Hellwig wrote: > On Thu, Oct 03, 2024 at 02:26:57PM +0200, Jan Kara wrote: > > On Thu 03-10-24 05:11:11, Christoph Hellwig wrote: > > > On Thu, Oct 03, 2024 at 01:57:21PM +0200, Jan Kara wrote: > > > > Fair enough. If we go with the iterator variant I've suggested to Dave in > > > > [1], we could combine the evict_inodes(), fsnotify_unmount_inodes() and > > > > Landlocks hook_sb_delete() into a single iteration relatively easily. But > > > > I'd wait with that convertion until this series lands. > > > > > > I don't see how that has anything to do with iterators or not. > > > > Well, the patches would obviously conflict > > Conflict with what? I thought you wanted the iterations to be unified in the current state of the code. If you meant after Dave's series, then we are in agreement. > > which seems pointless if we > > could live with three iterations for a few years until somebody noticed :). > > And with current Dave's version of iterators it will not be possible to > > integrate evict_inodes() iteration with the other two without a layering > > violation. Still we could go from 3 to 2 iterations. > > What layering violation? > > Below is quick compile tested part to do the fsnotify side and > get rid of the fsnotify iteration, which looks easily worth it. ... > @@ -789,11 +789,23 @@ static bool dispose_list(struct list_head *head) > */ > static int evict_inode_fn(struct inode *inode, void *data) > { > + struct super_block *sb = inode->i_sb; > struct list_head *dispose = data; > + bool post_unmount = !(sb->s_flags & SB_ACTIVE); > > spin_lock(&inode->i_lock); > - if (atomic_read(&inode->i_count) || > - (inode->i_state & (I_NEW | I_FREEING | I_WILL_FREE))) { > + if (atomic_read(&inode->i_count)) { > + spin_unlock(&inode->i_lock); > + > + /* for each watch, send FS_UNMOUNT and then remove it */ > + if (post_unmount && fsnotify_sb_info(sb)) { > + fsnotify_inode(inode, FS_UNMOUNT); > + fsnotify_inode_delete(inode); > + } This will not work because you are in the unsafe iterator holding sb->s_inode_list_lock. To be able to call into fsnotify, you need to do the iget / iput dance and release s_inode_list_lock, which does not work when a filesystem has its own inode iterator AFAICT... That's why I've called it a layering violation. Honza > + return INO_ITER_DONE; > + } > + > + if (inode->i_state & (I_NEW | I_FREEING | I_WILL_FREE)) { > spin_unlock(&inode->i_lock); > return INO_ITER_DONE; > } -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 34+ messages in thread
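For context, the "iget / iput dance" referred to here is the long-standing pattern for walking sb->s_inodes when the loop body needs to sleep or take other locks. A simplified sketch of that pattern (not code from this series): the reference on the current inode keeps the list cursor valid while the locks are dropped, and the previous inode is only released outside the list lock because iput() may sleep.

	struct inode *inode, *old_inode = NULL;

	spin_lock(&sb->s_inode_list_lock);
	list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
		spin_lock(&inode->i_lock);
		if (inode->i_state & (I_NEW | I_FREEING | I_WILL_FREE)) {
			spin_unlock(&inode->i_lock);
			continue;
		}
		__iget(inode);				/* pin the inode to keep our list position */
		spin_unlock(&inode->i_lock);
		spin_unlock(&sb->s_inode_list_lock);

		iput(old_inode);			/* drop the previous pin outside the lock */

		/* safe to sleep or take other locks here, e.g. call into fsnotify */

		old_inode = inode;
		cond_resched();
		spin_lock(&sb->s_inode_list_lock);
	}
	spin_unlock(&sb->s_inode_list_lock);
	iput(old_inode);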
* Re: lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes() 2024-10-03 12:56 ` Jan Kara @ 2024-10-03 13:04 ` Christoph Hellwig 2024-10-03 13:59 ` Dave Chinner 1 sibling, 0 replies; 34+ messages in thread From: Christoph Hellwig @ 2024-10-03 13:04 UTC (permalink / raw) To: Jan Kara Cc: Christoph Hellwig, Dave Chinner, linux-fsdevel, linux-xfs, linux-bcachefs, kent.overstreet, torvalds, Mickaël Salaün, Jann Horn, Serge Hallyn, Kees Cook, linux-security-module, Amir Goldstein On Thu, Oct 03, 2024 at 02:56:50PM +0200, Jan Kara wrote: > > + if (atomic_read(&inode->i_count)) { > > + spin_unlock(&inode->i_lock); > > + > > + /* for each watch, send FS_UNMOUNT and then remove it */ > > + if (post_unmount && fsnotify_sb_info(sb)) { > > + fsnotify_inode(inode, FS_UNMOUNT); > > + fsnotify_inode_delete(inode); > > + } > > This will not work because you are in unsafe iterator holding > sb->s_inode_list_lock. To be able to call into fsnotify, you need to do the > iget / iput dance and releasing of s_inode_list_lock which does not work > when a filesystem has its own inodes iterator AFAICT... That's why I've > called it a layering violation. Ah, yes. So we'll need to special case it some way either way. Still feels saner to do it in one iteration and make the inode eviction not use the unsafe version, but maybe that's indeed better postponed until after Dave's series. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes() 2024-10-03 12:56 ` Jan Kara 2024-10-03 13:04 ` Christoph Hellwig @ 2024-10-03 13:59 ` Dave Chinner 2024-10-03 16:17 ` Jan Kara 1 sibling, 1 reply; 34+ messages in thread From: Dave Chinner @ 2024-10-03 13:59 UTC (permalink / raw) To: Jan Kara Cc: Christoph Hellwig, linux-fsdevel, linux-xfs, linux-bcachefs, kent.overstreet, torvalds, Mickaël Salaün, Jann Horn, Serge Hallyn, Kees Cook, linux-security-module, Amir Goldstein On Thu, Oct 03, 2024 at 02:56:50PM +0200, Jan Kara wrote: > On Thu 03-10-24 05:39:23, Christoph Hellwig wrote: > > @@ -789,11 +789,23 @@ static bool dispose_list(struct list_head *head) > > */ > > static int evict_inode_fn(struct inode *inode, void *data) > > { > > + struct super_block *sb = inode->i_sb; > > struct list_head *dispose = data; > > + bool post_unmount = !(sb->s_flags & SB_ACTIVE); > > > > spin_lock(&inode->i_lock); > > - if (atomic_read(&inode->i_count) || > > - (inode->i_state & (I_NEW | I_FREEING | I_WILL_FREE))) { > > + if (atomic_read(&inode->i_count)) { > > + spin_unlock(&inode->i_lock); > > + > > + /* for each watch, send FS_UNMOUNT and then remove it */ > > + if (post_unmount && fsnotify_sb_info(sb)) { > > + fsnotify_inode(inode, FS_UNMOUNT); > > + fsnotify_inode_delete(inode); > > + } > > This will not work because you are in unsafe iterator holding > sb->s_inode_list_lock. To be able to call into fsnotify, you need to do the > iget / iput dance and releasing of s_inode_list_lock which does not work > when a filesystem has its own inodes iterator AFAICT... That's why I've > called it a layering violation. The whole point of the iget/iput dance is to stabilise the s_inodes list iteration whilst it is unlocked - the actual fsnotify calls don't need an inode reference to work correctly. IOWs, we don't need to run the fsnotify stuff right here - we can defer that like we do with the dispose list for all the inodes we mark as I_FREEING here. So if we pass a structure: struct evict_inode_args { struct list_head dispose; struct list_head fsnotify; }; If we use __iget() instead of requiring an inode state flag to keep the inode off the LRU for the fsnotify cleanup, then the code fragment above becomes: if (atomic_read(&inode->i_count)) { if (post_unmount && fsnotify_sb_info(sb)) { __iget(inode); inode_lru_list_del(inode); spin_unlock(&inode->i_lock); list_add(&inode->i_lru, &args->fsnotify); } return INO_ITER_DONE; } And then once we return to evict_inodes(), we do this: while (!list_empty(&args->fsnotify)) { struct inode *inode; inode = list_first_entry(&args->fsnotify, struct inode, i_lru); list_del_init(&inode->i_lru); fsnotify_inode(inode, FS_UNMOUNT); fsnotify_inode_delete(inode); iput(inode); cond_resched(); } And so now all the fsnotify cleanup is done outside the traversal in one large batch from evict_inodes(). As for the landlock code, I think it needs to have its own internal tracking mechanism and not search the sb inode list for inodes that it holds references to. LSM cleanup should be run before we get to tearing down the inode cache, not after.... -Dave. -- Dave Chinner david@fromorbit.com ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes() 2024-10-03 13:59 ` Dave Chinner @ 2024-10-03 16:17 ` Jan Kara 2024-10-04 0:46 ` Dave Chinner 0 siblings, 1 reply; 34+ messages in thread From: Jan Kara @ 2024-10-03 16:17 UTC (permalink / raw) To: Dave Chinner Cc: Jan Kara, Christoph Hellwig, linux-fsdevel, linux-xfs, linux-bcachefs, kent.overstreet, torvalds, Mickaël Salaün, Jann Horn, Serge Hallyn, Kees Cook, linux-security-module, Amir Goldstein On Thu 03-10-24 23:59:51, Dave Chinner wrote: > On Thu, Oct 03, 2024 at 02:56:50PM +0200, Jan Kara wrote: > > On Thu 03-10-24 05:39:23, Christoph Hellwig wrote: > > > @@ -789,11 +789,23 @@ static bool dispose_list(struct list_head *head) > > > */ > > > static int evict_inode_fn(struct inode *inode, void *data) > > > { > > > + struct super_block *sb = inode->i_sb; > > > struct list_head *dispose = data; > > > + bool post_unmount = !(sb->s_flags & SB_ACTIVE); > > > > > > spin_lock(&inode->i_lock); > > > - if (atomic_read(&inode->i_count) || > > > - (inode->i_state & (I_NEW | I_FREEING | I_WILL_FREE))) { > > > + if (atomic_read(&inode->i_count)) { > > > + spin_unlock(&inode->i_lock); > > > + > > > + /* for each watch, send FS_UNMOUNT and then remove it */ > > > + if (post_unmount && fsnotify_sb_info(sb)) { > > > + fsnotify_inode(inode, FS_UNMOUNT); > > > + fsnotify_inode_delete(inode); > > > + } > > > > This will not work because you are in unsafe iterator holding > > sb->s_inode_list_lock. To be able to call into fsnotify, you need to do the > > iget / iput dance and releasing of s_inode_list_lock which does not work > > when a filesystem has its own inodes iterator AFAICT... That's why I've > > called it a layering violation. > > The whole point of the iget/iput dance is to stabilise the > s_inodes list iteration whilst it is unlocked - the actual fsnotify > calls don't need an inode reference to work correctly. > > IOWs, we don't need to run the fsnotify stuff right here - we can > defer that like we do with the dispose list for all the inodes we > mark as I_FREEING here. > > So if we pass a structure: > > struct evict_inode_args { > struct list_head dispose; > struct list_head fsnotify; > }; > > If we use __iget() instead of requiring an inode state flag to keep > the inode off the LRU for the fsnotify cleanup, then the code > fragment above becomes: > > if (atomic_read(&inode->i_count)) { > if (post_unmount && fsnotify_sb_info(sb)) { > __iget(inode); > inode_lru_list_del(inode); > spin_unlock(&inode->i_lock); > list_add(&inode->i_lru, &args->fsnotify); > } Nit: Need to release i_lock in else branch here. Otherwise interesting idea. Yes, something like this could work even in unsafe iterator. > return INO_ITER_DONE; > } > And then once we return to evict_inodes(), we do this: > > while (!list_empty(args->fsnotify)) { > struct inode *inode > > inode = list_first_entry(head, struct inode, i_lru); > list_del_init(&inode->i_lru); > > fsnotify_inode(inode, FS_UNMOUNT); > fsnotify_inode_delete(inode); > iput(inode); > cond_resched(); > } > > And so now all the fsnotify cleanup is done outside the traversal in > one large batch from evict_inodes(). Yup. > As for the landlock code, I think it needs to have it's own internal > tracking mechanism and not search the sb inode list for inodes that > it holds references to. LSM cleanup should be run before before we > get to tearing down the inode cache, not after.... 
Well, I think LSM cleanup could in principle be handled together with the fsnotify cleanup but I didn't check the details. Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 34+ messages in thread
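Putting the fragments from the two mails above together, with the i_lock release from Jan's nit added, the idea sketches out roughly as below. This is an illustration only: it assumes the super_iter_inodes()/INO_ITER_DONE API from this series, the fsnotify_sb_info() check from Christoph's patch, and the evict_inode_args structure from Dave's mail, and it keeps the existing eviction tail of evict_inodes() for the unreferenced case.

	struct evict_inode_args {
		struct list_head	dispose;
		struct list_head	fsnotify;
	};

	static int evict_inode_fn(struct inode *inode, void *data)
	{
		struct evict_inode_args *args = data;
		struct super_block *sb = inode->i_sb;
		bool post_unmount = !(sb->s_flags & SB_ACTIVE);

		spin_lock(&inode->i_lock);
		if (atomic_read(&inode->i_count)) {
			/* Referenced: defer the fsnotify work, don't evict. */
			if (post_unmount && fsnotify_sb_info(sb)) {
				__iget(inode);
				inode_lru_list_del(inode);
				spin_unlock(&inode->i_lock);
				list_add(&inode->i_lru, &args->fsnotify);
			} else {
				/* the missing unlock Jan's nit points out */
				spin_unlock(&inode->i_lock);
			}
			return INO_ITER_DONE;
		}

		if (inode->i_state & (I_NEW | I_FREEING | I_WILL_FREE)) {
			spin_unlock(&inode->i_lock);
			return INO_ITER_DONE;
		}

		/* Unreferenced: evict as evict_inodes() does today. */
		inode->i_state |= I_FREEING;
		inode_lru_list_del(inode);
		spin_unlock(&inode->i_lock);
		list_add(&inode->i_lru, &args->dispose);
		return INO_ITER_DONE;
	}

Back in evict_inodes(), after the iteration returns and the dispose list has been processed, the deferred fsnotify work would run in one batch with all locks dropped:

	while (!list_empty(&args.fsnotify)) {
		struct inode *inode;

		inode = list_first_entry(&args.fsnotify, struct inode, i_lru);
		list_del_init(&inode->i_lru);

		fsnotify_inode(inode, FS_UNMOUNT);
		fsnotify_inode_delete(inode);
		iput(inode);
		cond_resched();
	}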
* Re: lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes() 2024-10-03 16:17 ` Jan Kara @ 2024-10-04 0:46 ` Dave Chinner 2024-10-04 7:21 ` Christian Brauner 0 siblings, 1 reply; 34+ messages in thread From: Dave Chinner @ 2024-10-04 0:46 UTC (permalink / raw) To: Jan Kara Cc: Christoph Hellwig, linux-fsdevel, linux-xfs, linux-bcachefs, kent.overstreet, torvalds, Mickaël Salaün, Jann Horn, Serge Hallyn, Kees Cook, linux-security-module, Amir Goldstein On Thu, Oct 03, 2024 at 06:17:31PM +0200, Jan Kara wrote: > On Thu 03-10-24 23:59:51, Dave Chinner wrote: > > As for the landlock code, I think it needs to have it's own internal > > tracking mechanism and not search the sb inode list for inodes that > > it holds references to. LSM cleanup should be run before before we > > get to tearing down the inode cache, not after.... > > Well, I think LSM cleanup could in principle be handled together with the > fsnotify cleanup but I didn't check the details. I'm not sure how we tell if an inode potentially has an LSM-related reference hanging off it. The landlock code appears to assume that the only referenced inodes it sees will have a valid inode->i_security pointer if landlock is enabled. i.e. it calls landlock_inode(inode) and dereferences the returned value without ever checking if inode->i_security is NULL or not. I mean, we could do a check for inode->i_security when the refcount is elevated and replace the security_sb_delete hook with a security_evict_inode hook similar to the proposed fsnotify eviction from evict_inodes(). But screwing with LSM infrastructure looks .... obnoxiously complex from the outside... -Dave. -- Dave Chinner david@fromorbit.com ^ permalink raw reply [flat|nested] 34+ messages in thread
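To make that idea concrete, the per-inode callout described above might look something like the fragment below, run from the elevated-refcount branch of the evict_inodes() walk. Note that security_evict_inode() is a hypothetical hook name taken from the paragraph above, not an existing LSM interface, and post_unmount is the flag from the earlier evict_inode_fn() sketches.

	/*
	 * Hypothetical sketch only: replace the sb-wide security_sb_delete()
	 * pass with a per-inode callout from evict_inodes(), gated on the
	 * inode actually having LSM state attached. security_evict_inode()
	 * does not exist today.
	 */
	if (post_unmount && atomic_read(&inode->i_count) && inode->i_security)
		security_evict_inode(inode);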
* Re: lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes() 2024-10-04 0:46 ` Dave Chinner @ 2024-10-04 7:21 ` Christian Brauner 2024-10-04 12:14 ` Christoph Hellwig 2024-10-04 22:57 ` Dave Chinner 0 siblings, 2 replies; 34+ messages in thread From: Christian Brauner @ 2024-10-04 7:21 UTC (permalink / raw) To: Dave Chinner Cc: Jan Kara, Christoph Hellwig, linux-fsdevel, linux-xfs, linux-bcachefs, kent.overstreet, torvalds, Mickaël Salaün, Jann Horn, Serge Hallyn, Kees Cook, linux-security-module, Amir Goldstein On Fri, Oct 04, 2024 at 10:46:27AM GMT, Dave Chinner wrote: > On Thu, Oct 03, 2024 at 06:17:31PM +0200, Jan Kara wrote: > > On Thu 03-10-24 23:59:51, Dave Chinner wrote: > > > As for the landlock code, I think it needs to have it's own internal > > > tracking mechanism and not search the sb inode list for inodes that > > > it holds references to. LSM cleanup should be run before before we > > > get to tearing down the inode cache, not after.... > > > > Well, I think LSM cleanup could in principle be handled together with the > > fsnotify cleanup but I didn't check the details. > > I'm not sure how we tell if an inode potentially has a LSM related > reference hanging off it. The landlock code looks to make an > assumption in that the only referenced inodes it sees will have a > valid inode->i_security pointer if landlock is enabled. i.e. it > calls landlock_inode(inode) and dereferences the returned value > without ever checking if inode->i_security is NULL or not. > > I mean, we could do a check for inode->i_security when the refcount > is elevated and replace the security_sb_delete hook with an > security_evict_inode hook similar to the proposed fsnotify eviction > from evict_inodes(). > > But screwing with LSM instructure looks .... obnoxiously complex > from the outside... Imho, please just focus on the immediate feedback and ignore all the extra bells and whistles that we could or should do. I prefer all of that to be done after this series lands. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes() 2024-10-04 7:21 ` Christian Brauner @ 2024-10-04 12:14 ` Christoph Hellwig 2024-10-04 13:49 ` Jan Kara 2024-10-04 22:57 ` Dave Chinner 1 sibling, 1 reply; 34+ messages in thread From: Christoph Hellwig @ 2024-10-04 12:14 UTC (permalink / raw) To: Christian Brauner Cc: Dave Chinner, Jan Kara, Christoph Hellwig, linux-fsdevel, linux-xfs, linux-bcachefs, kent.overstreet, torvalds, Mickaël Salaün, Jann Horn, Serge Hallyn, Kees Cook, linux-security-module, Amir Goldstein On Fri, Oct 04, 2024 at 09:21:19AM +0200, Christian Brauner wrote: > > But screwing with LSM instructure looks .... obnoxiously complex > > from the outside... > > Imho, please just focus on the immediate feedback and ignore all the > extra bells and whistles that we could or should do. I prefer all of > that to be done after this series lands. For the LSM mess: absolutely. For fsnotify it seems like Dave has a good idea to integrate it, and it removes the somewhat awkward need for the reffed flag. So if that delayed notify idea works out I'd prefer to see that in over the flag. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes() 2024-10-04 12:14 ` Christoph Hellwig @ 2024-10-04 13:49 ` Jan Kara 2024-10-04 18:15 ` Paul Moore 0 siblings, 1 reply; 34+ messages in thread From: Jan Kara @ 2024-10-04 13:49 UTC (permalink / raw) To: Christoph Hellwig Cc: Christian Brauner, Dave Chinner, Jan Kara, linux-fsdevel, linux-xfs, linux-bcachefs, kent.overstreet, torvalds, Mickaël Salaün, Jann Horn, Serge Hallyn, Kees Cook, linux-security-module, Amir Goldstein On Fri 04-10-24 05:14:36, Christoph Hellwig wrote: > On Fri, Oct 04, 2024 at 09:21:19AM +0200, Christian Brauner wrote: > > > But screwing with LSM instructure looks .... obnoxiously complex > > > from the outside... > > > > Imho, please just focus on the immediate feedback and ignore all the > > extra bells and whistles that we could or should do. I prefer all of > > that to be done after this series lands. > > For the LSM mess: absolutely. For fsnotify it seems like Dave has > a good idea to integrate it, and it removes the somewhat awkward > need for the reffed flag. So if that delayed notify idea works out > I'd prefer to see that in over the flag. As I wrote in one of the emails in this (now huge) thread, I'm fine with completely dropping that inode->i_refcount check from the fsnotify_unmount_inodes(). It made sense when it was called before evict_inodes() but after 1edc8eb2e931 ("fs: call fsnotify_sb_delete after evict_inodes") the usefulness of this check is rather doubtful. So we can drop the awkward flag regardless whether we unify evict_inodes() with fsnotify_unmount_inodes() or not. I have no strong preference whether the unification happens as part of this patch set or later on so it's up to Dave as far as I'm concerned. Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes() 2024-10-04 13:49 ` Jan Kara @ 2024-10-04 18:15 ` Paul Moore 0 siblings, 0 replies; 34+ messages in thread From: Paul Moore @ 2024-10-04 18:15 UTC (permalink / raw) To: Mickaël Salaün, Günther Noack Cc: Jan Kara, Christoph Hellwig, Christian Brauner, Dave Chinner, linux-fsdevel, linux-xfs, linux-bcachefs, kent.overstreet, torvalds, Mickaël Salaün, Jann Horn, Serge Hallyn, Kees Cook, linux-security-module, Amir Goldstein On Fri, Oct 4, 2024 at 9:49 AM Jan Kara <jack@suse.cz> wrote: > On Fri 04-10-24 05:14:36, Christoph Hellwig wrote: > > On Fri, Oct 04, 2024 at 09:21:19AM +0200, Christian Brauner wrote: > > > > But screwing with LSM instructure looks .... obnoxiously complex > > > > from the outside... > > > > > > Imho, please just focus on the immediate feedback and ignore all the > > > extra bells and whistles that we could or should do. I prefer all of > > > that to be done after this series lands. > > > > For the LSM mess: absolutely. For fsnotify it seems like Dave has > > a good idea to integrate it, and it removes the somewhat awkward > > need for the reffed flag. So if that delayed notify idea works out > > I'd prefer to see that in over the flag. > > As I wrote in one of the emails in this (now huge) thread, I'm fine with > completely dropping that inode->i_refcount check from the > fsnotify_unmount_inodes(). It made sense when it was called before > evict_inodes() but after 1edc8eb2e931 ("fs: call fsnotify_sb_delete after > evict_inodes") the usefulness of this check is rather doubtful. So we can > drop the awkward flag regardless whether we unify evict_inodes() with > fsnotify_unmount_inodes() or not. I have no strong preference whether the > unification happens as part of this patch set or later on so it's up to > Dave as far as I'm concerned. I didn't get a chance to look at this thread until just now and I'm noticing that the email used for Mickaël is likely not the best, I'm adding the email he uses in MAINTAINERS as well as that of Günther Noack, a designated Landlock reviewer. Mickaël, Günther, the lore link for the full discussion is below: https://lore.kernel.org/all/Zv5GfY1WS_aaczZM@infradead.org -- paul-moore.com ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes() 2024-10-04 7:21 ` Christian Brauner 2024-10-04 12:14 ` Christoph Hellwig @ 2024-10-04 22:57 ` Dave Chinner 2024-10-05 15:21 ` Mickaël Salaün 1 sibling, 1 reply; 34+ messages in thread From: Dave Chinner @ 2024-10-04 22:57 UTC (permalink / raw) To: Christian Brauner Cc: Jan Kara, Christoph Hellwig, linux-fsdevel, linux-xfs, linux-bcachefs, kent.overstreet, torvalds, Mickaël Salaün, Jann Horn, Serge Hallyn, Kees Cook, linux-security-module, Amir Goldstein On Fri, Oct 04, 2024 at 09:21:19AM +0200, Christian Brauner wrote: > On Fri, Oct 04, 2024 at 10:46:27AM GMT, Dave Chinner wrote: > > On Thu, Oct 03, 2024 at 06:17:31PM +0200, Jan Kara wrote: > > > On Thu 03-10-24 23:59:51, Dave Chinner wrote: > > > > As for the landlock code, I think it needs to have it's own internal > > > > tracking mechanism and not search the sb inode list for inodes that > > > > it holds references to. LSM cleanup should be run before before we > > > > get to tearing down the inode cache, not after.... > > > > > > Well, I think LSM cleanup could in principle be handled together with the > > > fsnotify cleanup but I didn't check the details. > > > > I'm not sure how we tell if an inode potentially has a LSM related > > reference hanging off it. The landlock code looks to make an > > assumption in that the only referenced inodes it sees will have a > > valid inode->i_security pointer if landlock is enabled. i.e. it > > calls landlock_inode(inode) and dereferences the returned value > > without ever checking if inode->i_security is NULL or not. > > > > I mean, we could do a check for inode->i_security when the refcount > > is elevated and replace the security_sb_delete hook with an > > security_evict_inode hook similar to the proposed fsnotify eviction > > from evict_inodes(). > > > > But screwing with LSM instructure looks .... obnoxiously complex > > from the outside... > > Imho, please just focus on the immediate feedback and ignore all the > extra bells and whistles that we could or should do. I prefer all of > that to be done after this series lands. Actually, it's not as bad as I thought it was going to be. I've already moved both fsnotify and LSM inode eviction to evict_inodes() as preparatory patches... Dave Chinner (2): vfs: move fsnotify inode eviction to evict_inodes() vfs, lsm: rework lsm inode eviction at unmount fs/inode.c | 52 +++++++++++++--- fs/notify/fsnotify.c | 60 ------------------- fs/super.c | 8 +-- include/linux/lsm_hook_defs.h | 2 +- include/linux/security.h | 2 +- security/landlock/fs.c | 134 ++++++++++-------------------------------- security/security.c | 31 ++++++---- 7 files changed, 99 insertions(+), 190 deletions(-) -Dave. -- Dave Chinner david@fromorbit.com ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes() 2024-10-04 22:57 ` Dave Chinner @ 2024-10-05 15:21 ` Mickaël Salaün 2024-10-05 16:03 ` Mickaël Salaün 2024-10-05 16:03 ` Paul Moore 0 siblings, 2 replies; 34+ messages in thread From: Mickaël Salaün @ 2024-10-05 15:21 UTC (permalink / raw) To: Dave Chinner Cc: Christian Brauner, Jan Kara, Christoph Hellwig, linux-fsdevel, linux-xfs, linux-bcachefs, kent.overstreet, torvalds, Jann Horn, Serge Hallyn, Kees Cook, linux-security-module, Amir Goldstein, Paul Moore, Günther Noack On Sat, Oct 05, 2024 at 08:57:32AM +1000, Dave Chinner wrote: > On Fri, Oct 04, 2024 at 09:21:19AM +0200, Christian Brauner wrote: > > On Fri, Oct 04, 2024 at 10:46:27AM GMT, Dave Chinner wrote: > > > On Thu, Oct 03, 2024 at 06:17:31PM +0200, Jan Kara wrote: > > > > On Thu 03-10-24 23:59:51, Dave Chinner wrote: > > > > > As for the landlock code, I think it needs to have it's own internal > > > > > tracking mechanism and not search the sb inode list for inodes that > > > > > it holds references to. LSM cleanup should be run before before we > > > > > get to tearing down the inode cache, not after.... > > > > > > > > Well, I think LSM cleanup could in principle be handled together with the > > > > fsnotify cleanup but I didn't check the details. > > > > > > I'm not sure how we tell if an inode potentially has a LSM related > > > reference hanging off it. The landlock code looks to make an > > > assumption in that the only referenced inodes it sees will have a > > > valid inode->i_security pointer if landlock is enabled. i.e. it > > > calls landlock_inode(inode) and dereferences the returned value > > > without ever checking if inode->i_security is NULL or not. Correct, i_security should always be valid when this hook is called because it means that at least Landlock is enabled and then i_security refers to a valid LSM blob. > > > > > > I mean, we could do a check for inode->i_security when the refcount > > > is elevated and replace the security_sb_delete hook with an > > > security_evict_inode hook similar to the proposed fsnotify eviction > > > from evict_inodes(). That would be nice. > > > > > > But screwing with LSM instructure looks .... obnoxiously complex > > > from the outside... > > > > Imho, please just focus on the immediate feedback and ignore all the > > extra bells and whistles that we could or should do. I prefer all of > > that to be done after this series lands. > > Actually, it's not as bad as I thought it was going to be. I've > already moved both fsnotify and LSM inode eviction to > evict_inodes() as preparatory patches... Good, please Cc me and Günther on related patch series. FYI, we have the two release_inodes tests to check this hook in tools/testing/selftests/landlock/fs_test.c > > Dave Chinner (2): > vfs: move fsnotify inode eviction to evict_inodes() > vfs, lsm: rework lsm inode eviction at unmount > > fs/inode.c | 52 +++++++++++++--- > fs/notify/fsnotify.c | 60 ------------------- > fs/super.c | 8 +-- > include/linux/lsm_hook_defs.h | 2 +- > include/linux/security.h | 2 +- > security/landlock/fs.c | 134 ++++++++++-------------------------------- > security/security.c | 31 ++++++---- > 7 files changed, 99 insertions(+), 190 deletions(-) > > -Dave. > -- > Dave Chinner > david@fromorbit.com > ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes() 2024-10-05 15:21 ` Mickaël Salaün @ 2024-10-05 16:03 ` Mickaël Salaün 2024-10-05 16:03 ` Paul Moore 1 sibling, 0 replies; 34+ messages in thread From: Mickaël Salaün @ 2024-10-05 16:03 UTC (permalink / raw) To: Dave Chinner Cc: Christian Brauner, Jan Kara, Christoph Hellwig, linux-fsdevel, linux-xfs, linux-bcachefs, kent.overstreet, torvalds, Jann Horn, Serge Hallyn, Kees Cook, linux-security-module, Amir Goldstein, Paul Moore, Günther Noack On Sat, Oct 05, 2024 at 05:21:30PM +0200, Mickaël Salaün wrote: > On Sat, Oct 05, 2024 at 08:57:32AM +1000, Dave Chinner wrote: > > Actually, it's not as bad as I thought it was going to be. I've > > already moved both fsnotify and LSM inode eviction to > > evict_inodes() as preparatory patches... > > Good, please Cc me and Günther on related patch series. > > FYI, we have the two release_inodes tests to check this hook in > tools/testing/selftests/landlock/fs_test.c > > > > > Dave Chinner (2): > > vfs: move fsnotify inode eviction to evict_inodes() > > vfs, lsm: rework lsm inode eviction at unmount > > > > fs/inode.c | 52 +++++++++++++--- > > fs/notify/fsnotify.c | 60 ------------------- > > fs/super.c | 8 +-- > > include/linux/lsm_hook_defs.h | 2 +- > > include/linux/security.h | 2 +- > > security/landlock/fs.c | 134 ++++++++++-------------------------------- Please run clang-format -i security/landlock/fs.c > > security/security.c | 31 ++++++---- > > 7 files changed, 99 insertions(+), 190 deletions(-) > > > > -Dave. > > -- > > Dave Chinner > > david@fromorbit.com > > ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes() 2024-10-05 15:21 ` Mickaël Salaün 2024-10-05 16:03 ` Mickaël Salaün @ 2024-10-05 16:03 ` Paul Moore 1 sibling, 0 replies; 34+ messages in thread From: Paul Moore @ 2024-10-05 16:03 UTC (permalink / raw) To: Mickaël Salaün Cc: Dave Chinner, Christian Brauner, Jan Kara, Christoph Hellwig, linux-fsdevel, linux-xfs, linux-bcachefs, kent.overstreet, torvalds, Jann Horn, Serge Hallyn, Kees Cook, linux-security-module, Amir Goldstein, Günther Noack On Sat, Oct 5, 2024 at 11:21 AM Mickaël Salaün <mic@digikod.net> wrote: > On Sat, Oct 05, 2024 at 08:57:32AM +1000, Dave Chinner wrote: ... > > Actually, it's not as bad as I thought it was going to be. I've > > already moved both fsnotify and LSM inode eviction to > > evict_inodes() as preparatory patches... > > Good, please Cc me and Günther on related patch series. As well as the LSM list since the LSM framework looks to have some changes as well. -- paul-moore.com ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes() 2024-10-03 11:57 ` Jan Kara 2024-10-03 12:11 ` Christoph Hellwig @ 2024-10-07 20:37 ` Linus Torvalds 2024-10-07 23:33 ` Dave Chinner 1 sibling, 1 reply; 34+ messages in thread From: Linus Torvalds @ 2024-10-07 20:37 UTC (permalink / raw) To: Jan Kara Cc: Christoph Hellwig, Dave Chinner, linux-fsdevel, linux-xfs, linux-bcachefs, kent.overstreet, Mickaël Salaün, Jann Horn, Serge Hallyn, Kees Cook, linux-security-module, Amir Goldstein On Thu, 3 Oct 2024 at 04:57, Jan Kara <jack@suse.cz> wrote: > > Fair enough. If we go with the iterator variant I've suggested to Dave in > [1], we could combine the evict_inodes(), fsnotify_unmount_inodes() and > Landlocks hook_sb_delete() into a single iteration relatively easily. But > I'd wait with that convertion until this series lands. Honza, I looked at this a bit more, particularly with an eye of "what happens if we just end up making the inode lifetimes subject to the dentry lifetimes" as suggested by Dave elsewhere. And honestly, the whole inode list use by the fsnotify layer seems to kind of suck. But I may be entirely missing something, so maybe I'm very wrong for some reason. The reason I say it "seems to kind of suck" is that the whole final /* for each watch, send FS_UNMOUNT and then remove it */ fsnotify_inode(inode, FS_UNMOUNT); fsnotify_inode_delete(inode); sequence seems to be entirely timing-dependent, and largely pointless and wrong. Why? Because inodes with no users will get removed at completely arbitrary times under memory pressure in evict() -> destroy_inode(), and obviously with I_DONTCACHE that ends up happening even earlier when the dentry is removed. So the whole "send FS_UNMOUNT and then remove it " thing seems to be entirely bogus, and depending on memory pressure, lots of inodes will only see the fsnotify_inode_delete() at eviction time and never get the FS_UNMOUNT notification anyway. So I get the feeling that we'd be better off entirely removing the sb->s_inodes use from fsnotify, and replace this "get rid of them at umount" with something like this instead: diff --git a/fs/dcache.c b/fs/dcache.c index 0f6b16ba30d0..aa2558de8d1f 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -406,6 +406,7 @@ static void dentry_unlink_inode(struct dentry * dentry) spin_unlock(&inode->i_lock); if (!inode->i_nlink) fsnotify_inoderemove(inode); + fsnotify_inode_delete(inode); if (dentry->d_op && dentry->d_op->d_iput) dentry->d_op->d_iput(dentry, inode); else diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h index 278620e063ab..ea91cc216028 100644 --- a/include/linux/fsnotify.h +++ b/include/linux/fsnotify.h @@ -261,7 +261,6 @@ static inline void fsnotify_vfsmount_delete(struct vfsmount *mnt) static inline void fsnotify_inoderemove(struct inode *inode) { fsnotify_inode(inode, FS_DELETE_SELF); - __fsnotify_inode_delete(inode); } /* which makes the fsnotify_inode_delete() happen when the inode is removed from the dentry. Then at umount time, the dentry shrinking will deal with all live dentries, and at most the fsnotify layer would send the FS_UNMOUNT to just the root dentry inodes? Wouldn't that make things much cleaner, and remove at least *one* odd use of the nasty s_inodes list? I have this feeling that maybe we can just remove the other users too using similar models. 
I think the LSM layer use (in landlock) is bogus for exactly the same reason - there's really no reason to keep things around for a random cached inode without a dentry. And I wonder if the quota code (which uses the s_inodes list to enable quotas on already mounted filesystems) could for all the same reasons just walk the dentry tree instead (and remove_dquot_ref similarly could just remove it at dentry_unlink_inode() time)? It really feels like most (all?) of the s_inode list users are basically historical, and shouldn't use that list at all. And there aren't _that_ many of them. I think Dave was right in just saying that this list should go away entirely (or was it somebody else who made that comment?) Linus ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes() 2024-10-07 20:37 ` Linus Torvalds @ 2024-10-07 23:33 ` Dave Chinner 2024-10-08 0:28 ` Linus Torvalds 2024-10-08 8:57 ` Amir Goldstein 0 siblings, 2 replies; 34+ messages in thread From: Dave Chinner @ 2024-10-07 23:33 UTC (permalink / raw) To: Linus Torvalds Cc: Jan Kara, Christoph Hellwig, linux-fsdevel, linux-xfs, linux-bcachefs, kent.overstreet, Mickaël Salaün, Jann Horn, Serge Hallyn, Kees Cook, linux-security-module, Amir Goldstein On Mon, Oct 07, 2024 at 01:37:19PM -0700, Linus Torvalds wrote: > On Thu, 3 Oct 2024 at 04:57, Jan Kara <jack@suse.cz> wrote: > > > > Fair enough. If we go with the iterator variant I've suggested to Dave in > > [1], we could combine the evict_inodes(), fsnotify_unmount_inodes() and > > Landlocks hook_sb_delete() into a single iteration relatively easily. But > > I'd wait with that convertion until this series lands. > > Honza, I looked at this a bit more, particularly with an eye of "what > happens if we just end up making the inode lifetimes subject to the > dentry lifetimes" as suggested by Dave elsewhere. .... > which makes the fsnotify_inode_delete() happen when the inode is > removed from the dentry. There may be other inode references being held that make the inode live longer than the dentry cache. When should the fsnotify marks be removed from the inode in that case? Do they need to remain until, e.g, writeback completes? > Then at umount time, the dentry shrinking will deal with all live > dentries, and at most the fsnotify layer would send the FS_UNMOUNT to > just the root dentry inodes? I don't think even that is necessary, because shrink_dcache_for_umount() drops the sb->s_root dentry after trimming the dentry tree. Hence the dcache drop would cleanup all inode references, roots included. > Wouldn't that make things much cleaner, and remove at least *one* odd > use of the nasty s_inodes list? Yes, it would, but someone who knows exactly when the fsnotify marks can be removed needs to chime in here... > I have this feeling that maybe we can just remove the other users too > using similar models. I think the LSM layer use (in landlock) is bogus > for exactly the same reason - there's really no reason to keep things > around for a random cached inode without a dentry. Perhaps, but I'm not sure what the landlock code is actually trying to do. It seems to be trying to avoid races between syscalls releasing inode references and unmount calling security_sb_delete() to clean up inode references that it has leaked. This implies that it's not a) tracking inodes itself, and b) not cleaning up internal state early enough in unmount. Hence, to me, the lifecycle and reference counting of inode related objects in landlock doesn't seem quite right, and the use of the security_sb_delete() callout appears to be papering over an internal lifecycle issue. I'd love to get rid of it altogether. > And I wonder if the quota code (which uses the s_inodes list to enable > quotas on already mounted filesystems) could for all the same reasons > just walk the dentry tree instead (and remove_dquot_ref similarly > could just remove it at dentry_unlink_inode() time)? I don't think that will work because we have to be able to modify quota in evict() processing. This is especially true for unlinked inodes being evicted from cache, but also the dquots need to stay attached until writeback completes. 
Hence I don't think we can remove the quota refs from the inode before we call iput_final(), and so I think quotaoff (at least) still needs to iterate inodes... > It really feels like most (all?) of the s_inode list users are > basically historical, and shouldn't use that list at all. And there > aren't _that_ many of them. I think Dave was right in just saying that > this list should go away entirely (or was it somebody else who made > that comment?) Yeah, I said that it should go away entirely. My view of this whole s_inodes list is that subsystems that are taking references to inodes *must* track or manage the references to the inodes themselves. The canonical example is the VFS itself: evict_inodes() doesn't need to iterate s_inodes at all. It can walk the inode LRU to purge all the unreferenced cached inodes from memory. iput_final() guarantees that all unreferenced inodes are either put on the LRU or torn down immediately. Hence I think that it is a poor architectural decision to require superblock teardown to clean up inode references random subsystems have *leaked* to prevent UAFs. It forces the sb to track all inodes whether the VFS actually needs to track them or not. -Dave. -- Dave Chinner david@fromorbit.com ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes() 2024-10-07 23:33 ` Dave Chinner @ 2024-10-08 0:28 ` Linus Torvalds 2024-10-08 0:54 ` Linus Torvalds 2024-10-08 12:59 ` Mickaël Salaün 2024-10-08 8:57 ` Amir Goldstein 1 sibling, 2 replies; 34+ messages in thread From: Linus Torvalds @ 2024-10-08 0:28 UTC (permalink / raw) To: Dave Chinner Cc: Jan Kara, Christoph Hellwig, linux-fsdevel, linux-xfs, linux-bcachefs, kent.overstreet, Mickaël Salaün, Jann Horn, Serge Hallyn, Kees Cook, linux-security-module, Amir Goldstein On Mon, 7 Oct 2024 at 16:33, Dave Chinner <david@fromorbit.com> wrote: > > There may be other inode references being held that make > the inode live longer than the dentry cache. When should the > fsnotify marks be removed from the inode in that case? Do they need > to remain until, e.g, writeback completes? Note that my idea is to just remove the fsnotify marks when the dentry discards the inode. That means that yes, the inode may still have a lifetime after the dentry (because of other references, _or_ just because I_DONTCACHE isn't set and we keep caching the inode). BUT - fsnotify won't care. There won't be any fsnotify marks on that inode any more, and without a dentry that points to it, there's no way to add such marks. (A new dentry may be re-attached to such an inode, and then fsnotify could re-add new marks, but that doesn't change anything - the next time the dentry is detached, the marks would go away again). And yes, this changes the timing on when fsnotify events happen, but what I'm actually hoping for is that Jan will agree that it doesn't actually matter semantically. > > Then at umount time, the dentry shrinking will deal with all live > > dentries, and at most the fsnotify layer would send the FS_UNMOUNT to > > just the root dentry inodes? > > I don't think even that is necessary, because > shrink_dcache_for_umount() drops the sb->s_root dentry after > trimming the dentry tree. Hence the dcache drop would cleanup all > inode references, roots included. Ahh - even better. I didn't actually look very closely at the actual umount path, I was looking just at the fsnotify_inoderemove() place in dentry_unlink_inode() and went "couldn't we do _this_ instead?" > > Wouldn't that make things much cleaner, and remove at least *one* odd > > use of the nasty s_inodes list? > > Yes, it would, but someone who knows exactly when the fsnotify > marks can be removed needs to chime in here... Yup. Honza? (Aside: I don't actually know if you prefer Jan or Honza, so I use both randomly and interchangeably?) > > I have this feeling that maybe we can just remove the other users too > > using similar models. I think the LSM layer use (in landlock) is bogus > > for exactly the same reason - there's really no reason to keep things > > around for a random cached inode without a dentry. > > Perhaps, but I'm not sure what the landlock code is actually trying > to do. Yeah, I wouldn't be surprised if it's just confused - it's very odd. But I'd be perfectly happy just removing one use at a time - even if we keep the s_inodes list around because of other users, it would still be "one less thing". > Hence, to me, the lifecycle and reference counting of inode related > objects in landlock doesn't seem quite right, and the use of the > security_sb_delete() callout appears to be papering over an internal > lifecycle issue. > > I'd love to get rid of it altogether. 
Yeah, I think the inode lifetime is just so random these days that anything that depends on it is questionable. The quota case is probably the only thing where the inode lifetime *really* makes sense, and that's the one where I looked at the code and went "I *hope* this can be converted to traversing the dentry tree", but at the same time it did look sensible to make it be about inodes. If we can convert the quota side to be based on dentry lifetimes, it will almost certainly then have to react to the places that do "d_add()" when re-connecting an inode to a dentry at lookup time. So yeah, the quota code looks worse, but even if we could just remove fsnotify and landlock, I'd still be much happier. Linus ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes() 2024-10-08 0:28 ` Linus Torvalds @ 2024-10-08 0:54 ` Linus Torvalds 2024-10-09 9:49 ` Jan Kara 2024-10-08 12:59 ` Mickaël Salaün 1 sibling, 1 reply; 34+ messages in thread From: Linus Torvalds @ 2024-10-08 0:54 UTC (permalink / raw) To: Dave Chinner Cc: Jan Kara, Christoph Hellwig, linux-fsdevel, linux-xfs, linux-bcachefs, kent.overstreet, Mickaël Salaün, Jann Horn, Serge Hallyn, Kees Cook, linux-security-module, Amir Goldstein On Mon, 7 Oct 2024 at 17:28, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > And yes, this changes the timing on when fsnotify events happen, but > what I'm actually hoping for is that Jan will agree that it doesn't > actually matter semantically. .. and yes, I realize it might actually matter. fsnotify does do 'ihold()' to hold an inode ref, and with this that would actually be more or less pointless, because the mark would be removed _despite_ such a ref. So maybe it's not an option to do what I suggested. I don't know the users well enough. Linus ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes() 2024-10-08 0:54 ` Linus Torvalds @ 2024-10-09 9:49 ` Jan Kara 0 siblings, 0 replies; 34+ messages in thread From: Jan Kara @ 2024-10-09 9:49 UTC (permalink / raw) To: Linus Torvalds Cc: Dave Chinner, Jan Kara, Christoph Hellwig, linux-fsdevel, linux-xfs, linux-bcachefs, kent.overstreet, Mickaël Salaün, Jann Horn, Serge Hallyn, Kees Cook, linux-security-module, Amir Goldstein On Mon 07-10-24 17:54:16, Linus Torvalds wrote: > On Mon, 7 Oct 2024 at 17:28, Linus Torvalds > <torvalds@linux-foundation.org> wrote: > > > > And yes, this changes the timing on when fsnotify events happen, but > > what I'm actually hoping for is that Jan will agree that it doesn't > > actually matter semantically. > > .. and yes, I realize it might actually matter. fsnotify does do > 'ihold()' to hold an inode ref, and with this that would actually be > more or less pointless, because the mark would be removed _despite_ > such a ref. > > So maybe it's not an option to do what I suggested. I don't know the > users well enough. Yeah, we need to keep the notification mark alive either until the inode is deleted or until the filesystem is unmounted to maintain the behavior of the inotify and fanotify APIs. That being said we could rework the lifetime rules inside the fsnotify subsystem as Dave suggests so that fsnotify would not pin inodes, detach its structures from inodes on inode reclaim and associate notification marks with inodes when they are loaded from disk. But it's a relatively big overhaul. Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes() 2024-10-08 0:28 ` Linus Torvalds 2024-10-08 0:54 ` Linus Torvalds @ 2024-10-08 12:59 ` Mickaël Salaün 2024-10-09 0:21 ` Dave Chinner 1 sibling, 1 reply; 34+ messages in thread From: Mickaël Salaün @ 2024-10-08 12:59 UTC (permalink / raw) To: Linus Torvalds Cc: Dave Chinner, Jan Kara, Christoph Hellwig, linux-fsdevel, linux-xfs, linux-bcachefs, kent.overstreet, Jann Horn, Serge Hallyn, Kees Cook, linux-security-module, Amir Goldstein, Günther Noack On Mon, Oct 07, 2024 at 05:28:57PM -0700, Linus Torvalds wrote: > On Mon, 7 Oct 2024 at 16:33, Dave Chinner <david@fromorbit.com> wrote: > > > > There may be other inode references being held that make > > the inode live longer than the dentry cache. When should the > > fsnotify marks be removed from the inode in that case? Do they need > > to remain until, e.g, writeback completes? > > Note that my idea is to just remove the fsnotify marks when the dentry > discards the inode. > > That means that yes, the inode may still have a lifetime after the > dentry (because of other references, _or_ just because I_DONTCACHE > isn't set and we keep caching the inode). > > BUT - fsnotify won't care. There won't be any fsnotify marks on that > inode any more, and without a dentry that points to it, there's no way > to add such marks. > > (A new dentry may be re-attached to such an inode, and then fsnotify > could re-add new marks, but that doesn't change anything - the next > time the dentry is detached, the marks would go away again). > > And yes, this changes the timing on when fsnotify events happen, but > what I'm actually hoping for is that Jan will agree that it doesn't > actually matter semantically. > > > > Then at umount time, the dentry shrinking will deal with all live > > > dentries, and at most the fsnotify layer would send the FS_UNMOUNT to > > > just the root dentry inodes? > > > > I don't think even that is necessary, because > > shrink_dcache_for_umount() drops the sb->s_root dentry after > > trimming the dentry tree. Hence the dcache drop would cleanup all > > inode references, roots included. > > Ahh - even better. > > I didn't actually look very closely at the actual umount path, I was > looking just at the fsnotify_inoderemove() place in > dentry_unlink_inode() and went "couldn't we do _this_ instead?" > > > > Wouldn't that make things much cleaner, and remove at least *one* odd > > > use of the nasty s_inodes list? > > > > Yes, it would, but someone who knows exactly when the fsnotify > > marks can be removed needs to chime in here... > > Yup. Honza? > > (Aside: I don't actually know if you prefer Jan or Honza, so I use > both randomly and interchangeably?) > > > > I have this feeling that maybe we can just remove the other users too > > > using similar models. I think the LSM layer use (in landlock) is bogus > > > for exactly the same reason - there's really no reason to keep things > > > around for a random cached inode without a dentry. > > > > Perhaps, but I'm not sure what the landlock code is actually trying > > to do. In Landlock, inodes (see landlock_object) may be referenced by several rulesets, either tied to a task's cred or a ruleset's file descriptor. A ruleset may outlive its referenced inodes, and this should not block related umounts. security_sb_delete() is used to gracefully release such references. > > Yeah, I wouldn't be surprised if it's just confused - it's very odd. 
> > But I'd be perfectly happy just removing one use at a time - even if > we keep the s_inodes list around because of other users, it would > still be "one less thing". > > > Hence, to me, the lifecycle and reference counting of inode related > > objects in landlock doesn't seem quite right, and the use of the > > security_sb_delete() callout appears to be papering over an internal > > lifecycle issue. > > > > I'd love to get rid of it altogether. I'm not sure I fully understand the implications for now, but it would definitely be good to simplify this lifetime management. The only requirement for Landlock is that inode references should live as long as the related inodes are accessible by user space or already in use. The sooner these references are removed from the related rulesets, the better. > > Yeah, I think the inode lifetime is just so random these days that > anything that depends on it is questionable. > > The quota case is probably the only thing where the inode lifetime > *really* makes sense, and that's the one where I looked at the code > and went "I *hope* this can be converted to traversing the dentry > tree", but at the same time it did look sensible to make it be about > inodes. > > If we can convert the quota side to be based on dentry lifetimes, it > will almost certainly then have to react to the places that do > "d_add()" when re-connecting an inode to a dentry at lookup time. > > So yeah, the quota code looks worse, but even if we could just remove > fsnotify and landlock, I'd still be much happier. > > Linus ^ permalink raw reply [flat|nested] 34+ messages in thread
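To make that lifetime requirement concrete, here is a rough sketch of what a security_sb_delete() implementation of the kind described above has to do at unmount; the struct layout, list and helper names are illustrative assumptions, not the actual Landlock code.

#include <linux/fs.h>
#include <linux/list.h>
#include <linux/spinlock.h>
#include <linux/refcount.h>

/* Illustrative only: field, list and helper names are assumptions. */
struct example_object {
	refcount_t usage;		/* one reference per ruleset rule */
	spinlock_t lock;
	struct inode *inode;		/* pinned with iget(); NULL once released */
	struct list_head sb_node;	/* on a per-superblock cleanup list */
};

/* Hypothetical accessor for a per-sb list of pinned objects. */
struct list_head *example_sb_objects(struct super_block *sb);

static void example_hook_sb_delete(struct super_block *sb)
{
	struct example_object *obj, *tmp;

	list_for_each_entry_safe(obj, tmp, example_sb_objects(sb), sb_node) {
		struct inode *inode;

		/* Detach the inode so rulesets simply stop matching it;
		 * the rulesets themselves stay valid after the umount. */
		spin_lock(&obj->lock);
		inode = obj->inode;
		obj->inode = NULL;
		spin_unlock(&obj->lock);

		list_del_init(&obj->sb_node);
		iput(inode);	/* drop the reference taken at rule creation */
	}
}

The walk exists only to drop the iget() references taken on behalf of rulesets so the unmount is not blocked; whether that walk has to go over sb->s_inodes or over a private list like this one is exactly what the rest of the thread debates.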
* Re: lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes() 2024-10-08 12:59 ` Mickaël Salaün @ 2024-10-09 0:21 ` Dave Chinner 2024-10-09 9:23 ` Mickaël Salaün 0 siblings, 1 reply; 34+ messages in thread From: Dave Chinner @ 2024-10-09 0:21 UTC (permalink / raw) To: Mickaël Salaün Cc: Linus Torvalds, Jan Kara, Christoph Hellwig, linux-fsdevel, linux-xfs, linux-bcachefs, kent.overstreet, Jann Horn, Serge Hallyn, Kees Cook, linux-security-module, Amir Goldstein, Günther Noack On Tue, Oct 08, 2024 at 02:59:07PM +0200, Mickaël Salaün wrote: > On Mon, Oct 07, 2024 at 05:28:57PM -0700, Linus Torvalds wrote: > > On Mon, 7 Oct 2024 at 16:33, Dave Chinner <david@fromorbit.com> wrote: > > > > > > There may be other inode references being held that make > > > the inode live longer than the dentry cache. When should the > > > fsnotify marks be removed from the inode in that case? Do they need > > > to remain until, e.g, writeback completes? > > > > Note that my idea is to just remove the fsnotify marks when the dentry > > discards the inode. > > > > That means that yes, the inode may still have a lifetime after the > > dentry (because of other references, _or_ just because I_DONTCACHE > > isn't set and we keep caching the inode). > > > > BUT - fsnotify won't care. There won't be any fsnotify marks on that > > inode any more, and without a dentry that points to it, there's no way > > to add such marks. > > > > (A new dentry may be re-attached to such an inode, and then fsnotify > > could re-add new marks, but that doesn't change anything - the next > > time the dentry is detached, the marks would go away again). > > > > And yes, this changes the timing on when fsnotify events happen, but > > what I'm actually hoping for is that Jan will agree that it doesn't > > actually matter semantically. > > > > > > Then at umount time, the dentry shrinking will deal with all live > > > > dentries, and at most the fsnotify layer would send the FS_UNMOUNT to > > > > just the root dentry inodes? > > > > > > I don't think even that is necessary, because > > > shrink_dcache_for_umount() drops the sb->s_root dentry after > > > trimming the dentry tree. Hence the dcache drop would cleanup all > > > inode references, roots included. > > > > Ahh - even better. > > > > I didn't actually look very closely at the actual umount path, I was > > looking just at the fsnotify_inoderemove() place in > > dentry_unlink_inode() and went "couldn't we do _this_ instead?" > > > > > > Wouldn't that make things much cleaner, and remove at least *one* odd > > > > use of the nasty s_inodes list? > > > > > > Yes, it would, but someone who knows exactly when the fsnotify > > > marks can be removed needs to chime in here... > > > > Yup. Honza? > > > > (Aside: I don't actually know if you prefer Jan or Honza, so I use > > both randomly and interchangeably?) > > > > > > I have this feeling that maybe we can just remove the other users too > > > > using similar models. I think the LSM layer use (in landlock) is bogus > > > > for exactly the same reason - there's really no reason to keep things > > > > around for a random cached inode without a dentry. > > > > > > Perhaps, but I'm not sure what the landlock code is actually trying > > > to do. > > In Landlock, inodes (see landlock_object) may be referenced by several > rulesets, either tied to a task's cred or a ruleset's file descriptor. > A ruleset may outlive its referenced inodes, and this should not block > related umounts. 
security_sb_delete() is used to gracefully release > such references. Ah, there's the problem. The ruleset is persistent, not the inode. Like fsnotify, the life cycle and reference counting is upside down. The inode should cache the ruleset rather than the ruleset pinning the inode. See my reply to Jan about fsnotify. > > Yeah, I wouldn't be surprised if it's just confused - it's very odd. > > > > But I'd be perfectly happy just removing one use at a time - even if > > we keep the s_inodes list around because of other users, it would > > still be "one less thing". > > > > > Hence, to me, the lifecycle and reference counting of inode related > > > objects in landlock doesn't seem quite right, and the use of the > > > security_sb_delete() callout appears to be papering over an internal > > > lifecycle issue. > > > > > > I'd love to get rid of it altogether. > > I'm not sure to fully understand the implications for now, but it would > definitely be good to simplify this lifetime management. The only > requirement for Landlock is that inodes references should live as long > as the related inodes are accessible by user space or already in use. > The sooner these references are removed from related ruleset, the > better. I'm missing something. Inodes are accessible to users even when they are not in cache - we just read them from disk and instantiate a new VFS inode. So how do you attach the correct ruleset to a newly instantiated inode? i.e. If you can find the ruleset for any given inode that is brought into cache (e.g. opening an existing, uncached file), then why do you need to take inode references so they are never evicted? -Dave. -- Dave Chinner david@fromorbit.com ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes() 2024-10-09 0:21 ` Dave Chinner @ 2024-10-09 9:23 ` Mickaël Salaün 0 siblings, 0 replies; 34+ messages in thread From: Mickaël Salaün @ 2024-10-09 9:23 UTC (permalink / raw) To: Dave Chinner Cc: Linus Torvalds, Jan Kara, Christoph Hellwig, linux-fsdevel, linux-xfs, linux-bcachefs, kent.overstreet, Jann Horn, Serge Hallyn, Kees Cook, linux-security-module, Amir Goldstein, Günther Noack, Christian Brauner On Wed, Oct 09, 2024 at 11:21:10AM +1100, Dave Chinner wrote: > On Tue, Oct 08, 2024 at 02:59:07PM +0200, Mickaël Salaün wrote: > > On Mon, Oct 07, 2024 at 05:28:57PM -0700, Linus Torvalds wrote: > > > On Mon, 7 Oct 2024 at 16:33, Dave Chinner <david@fromorbit.com> wrote: > > > > > > > > There may be other inode references being held that make > > > > the inode live longer than the dentry cache. When should the > > > > fsnotify marks be removed from the inode in that case? Do they need > > > > to remain until, e.g, writeback completes? > > > > > > Note that my idea is to just remove the fsnotify marks when the dentry > > > discards the inode. > > > > > > That means that yes, the inode may still have a lifetime after the > > > dentry (because of other references, _or_ just because I_DONTCACHE > > > isn't set and we keep caching the inode). > > > > > > BUT - fsnotify won't care. There won't be any fsnotify marks on that > > > inode any more, and without a dentry that points to it, there's no way > > > to add such marks. > > > > > > (A new dentry may be re-attached to such an inode, and then fsnotify > > > could re-add new marks, but that doesn't change anything - the next > > > time the dentry is detached, the marks would go away again). > > > > > > And yes, this changes the timing on when fsnotify events happen, but > > > what I'm actually hoping for is that Jan will agree that it doesn't > > > actually matter semantically. > > > > > > > > Then at umount time, the dentry shrinking will deal with all live > > > > > dentries, and at most the fsnotify layer would send the FS_UNMOUNT to > > > > > just the root dentry inodes? > > > > > > > > I don't think even that is necessary, because > > > > shrink_dcache_for_umount() drops the sb->s_root dentry after > > > > trimming the dentry tree. Hence the dcache drop would cleanup all > > > > inode references, roots included. > > > > > > Ahh - even better. > > > > > > I didn't actually look very closely at the actual umount path, I was > > > looking just at the fsnotify_inoderemove() place in > > > dentry_unlink_inode() and went "couldn't we do _this_ instead?" > > > > > > > > Wouldn't that make things much cleaner, and remove at least *one* odd > > > > > use of the nasty s_inodes list? > > > > > > > > Yes, it would, but someone who knows exactly when the fsnotify > > > > marks can be removed needs to chime in here... > > > > > > Yup. Honza? > > > > > > (Aside: I don't actually know if you prefer Jan or Honza, so I use > > > both randomly and interchangeably?) > > > > > > > > I have this feeling that maybe we can just remove the other users too > > > > > using similar models. I think the LSM layer use (in landlock) is bogus > > > > > for exactly the same reason - there's really no reason to keep things > > > > > around for a random cached inode without a dentry. > > > > > > > > Perhaps, but I'm not sure what the landlock code is actually trying > > > > to do. 
> > > > In Landlock, inodes (see landlock_object) may be referenced by several > > rulesets, either tied to a task's cred or a ruleset's file descriptor. > > A ruleset may outlive its referenced inodes, and this should not block > > related umounts. security_sb_delete() is used to gracefully release > > such references. > > Ah, there's the problem. The ruleset is persistent, not the inode. > Like fsnotify, the life cycle and reference counting is upside down. > The inode should cache the ruleset rather than the ruleset pinning > the inode. A ruleset needs to take a reference to the inode as for an opened file and keep it "alive" as long as it may be re-used by user space (i.e. as long as the superblock exists). One of the goals of a ruleset is to identify inodes as long as they are accessible. When a sandboxed process requests to open a file, its sandbox's ruleset is checked against the referenced inodes (in a nutshell). In practice, rulesets reference a set of struct landlock_object, each of which references an inode or nothing (if it vanished). There is only one landlock_object referenced per inode. This makes it possible to have a dynamic N:M mapping between rulesets and inodes, which enables a ruleset to be deleted before its referenced inodes, or the other way around. > > See my reply to Jan about fsnotify. > > > > Yeah, I wouldn't be surprised if it's just confused - it's very odd. > > > > > > But I'd be perfectly happy just removing one use at a time - even if > > > we keep the s_inodes list around because of other users, it would > > > still be "one less thing". > > > > > > > Hence, to me, the lifecycle and reference counting of inode related > > > > objects in landlock doesn't seem quite right, and the use of the > > > > security_sb_delete() callout appears to be papering over an internal > > > > lifecycle issue. > > > > > > > > I'd love to get rid of it altogether. > > > > I'm not sure to fully understand the implications for now, but it would > > definitely be good to simplify this lifetime management. The only > > requirement for Landlock is that inodes references should live as long > > as the related inodes are accessible by user space or already in use. > > The sooner these references are removed from related ruleset, the > > better. > > I'm missing something. Inodes are accessible to users even when > they are not in cache - we just read them from disk and instantiate > a new VFS inode. > > So how do you attach the correct ruleset to a newly instantiated > inode? We can see a Landlock ruleset as a set of weakly opened files/inodes. A Landlock ruleset calls iget() to keep the related VFS inodes alive, which means that when user space opens a file pointing to the same inode, the same VFS inode will be re-used and then we can match it against a ruleset. > > i.e. If you can find the ruleset for any given inode that is brought > into cache (e.g. opening an existing, uncached file), then why do > you need to take inode references so they are never evicted? A landlock_object only keeps a reference to an inode, not to the rulesets pointing to it: * inode -> 1 landlock_object or NULL * landlock_object -> 1 inode or NULL * ruleset -> N landlock_object There are mainly two different operations: 1. Match 1 inode against a set of N inode references (i.e. a ruleset). 2. Drop the references of N rulesets (in practice 1 intermediate landlock_object) pointing to 1 inode. > > -Dave. > -- > Dave Chinner > david@fromorbit.com > ^ permalink raw reply [flat|nested] 34+ messages in thread
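For readers following along, a minimal sketch of the N:M mapping described above; struct landlock_object is the real name, but the field layout and the accessor here are assumptions for illustration only.

#include <linux/fs.h>
#include <linux/rcupdate.h>
#include <linux/refcount.h>
#include <linux/spinlock.h>

/* Illustrative layout, not the actual Landlock definitions. */
struct example_object {
	refcount_t usage;	/* one reference per ruleset rule pointing here */
	spinlock_t lock;
	struct inode *inode;	/* pinned via iget(); NULL if the inode vanished */
};

/* At most one object per inode, reachable from the inode's security blob. */
struct example_inode_security {
	struct example_object __rcu *object;
};

/* A rule references an object, never an inode directly. */
struct example_rule {
	struct example_object *object;
	u32 access;		/* allowed access rights */
};

/* Hypothetical accessor for the inode's security blob. */
struct example_inode_security *example_inode(const struct inode *inode);

/* Operation 1: match one inode against one of the N references in a ruleset. */
static bool example_rule_matches(const struct example_rule *rule,
				 const struct inode *inode)
{
	const struct example_inode_security *sec = example_inode(inode);

	return sec && rcu_access_pointer(sec->object) == rule->object;
}

Operation 2 from the list above (dropping the references of N rulesets pointing at one inode) then only needs to clear and iput() the single object->inode pointer, which is what the unmount hook does today.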
* Re: lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes() 2024-10-07 23:33 ` Dave Chinner 2024-10-08 0:28 ` Linus Torvalds @ 2024-10-08 8:57 ` Amir Goldstein 2024-10-08 11:23 ` Jan Kara 1 sibling, 1 reply; 34+ messages in thread From: Amir Goldstein @ 2024-10-08 8:57 UTC (permalink / raw) To: Dave Chinner Cc: Linus Torvalds, Jan Kara, Christoph Hellwig, linux-fsdevel, linux-xfs, linux-bcachefs, kent.overstreet, Mickaël Salaün, Jann Horn, Serge Hallyn, Kees Cook, linux-security-module On Tue, Oct 8, 2024 at 1:33 AM Dave Chinner <david@fromorbit.com> wrote: > > On Mon, Oct 07, 2024 at 01:37:19PM -0700, Linus Torvalds wrote: > > On Thu, 3 Oct 2024 at 04:57, Jan Kara <jack@suse.cz> wrote: > > > > > > Fair enough. If we go with the iterator variant I've suggested to Dave in > > > [1], we could combine the evict_inodes(), fsnotify_unmount_inodes() and > > > Landlocks hook_sb_delete() into a single iteration relatively easily. But > > > I'd wait with that convertion until this series lands. > > > > Honza, I looked at this a bit more, particularly with an eye of "what > > happens if we just end up making the inode lifetimes subject to the > > dentry lifetimes" as suggested by Dave elsewhere. > > .... > > > which makes the fsnotify_inode_delete() happen when the inode is > > removed from the dentry. > > There may be other inode references being held that make > the inode live longer than the dentry cache. When should the > fsnotify marks be removed from the inode in that case? Do they need > to remain until, e.g, writeback completes? > fsnotify inode marks remain until explicitly removed or until sb is unmounted (*), so other inode references are irrelevant to inode mark removal. (*) fanotify has "evictable" inode marks, which do not hold inode reference and go away on inode evict, but those mark evictions do not generate any event (i.e. there is no FAN_UNMOUNT). > > Then at umount time, the dentry shrinking will deal with all live > > dentries, and at most the fsnotify layer would send the FS_UNMOUNT to > > just the root dentry inodes? > > I don't think even that is necessary, because > shrink_dcache_for_umount() drops the sb->s_root dentry after > trimming the dentry tree. Hence the dcache drop would cleanup all > inode references, roots included. > > > Wouldn't that make things much cleaner, and remove at least *one* odd > > use of the nasty s_inodes list? > > Yes, it would, but someone who knows exactly when the fsnotify > marks can be removed needs to chime in here... > > > I have this feeling that maybe we can just remove the other users too > > using similar models. I think the LSM layer use (in landlock) is bogus > > for exactly the same reason - there's really no reason to keep things > > around for a random cached inode without a dentry. > > Perhaps, but I'm not sure what the landlock code is actually trying > to do. It seems to be trying to avoid races between syscalls > releasing inode references and unmount calling security_sb_delete() > to clean up inode references that it has leaked. This implies that > it's not a) tracking inodes itself, and b) not cleaning up internal > state early enough in unmount. > > Hence, to me, the lifecycle and reference counting of inode related > objects in landlock doesn't seem quite right, and the use of the > security_sb_delete() callout appears to be papering over an internal > lifecycle issue. > > I'd love to get rid of it altogether. 
> > > And I wonder if the quota code (which uses the s_inodes list to enable > > quotas on already mounted filesystems) could for all the same reasons > > just walk the dentry tree instead (and remove_dquot_ref similarly > > could just remove it at dentry_unlink_inode() time)? > > I don't think that will work because we have to be able to modify > quota in evict() processing. This is especially true for unlinked > inodes being evicted from cache, but also the dquots need to stay > attached until writeback completes. > > Hence I don't think we can remove the quota refs from the inode > before we call iput_final(), and so I think quotaoff (at least) > still needs to iterate inodes... > > > It really feels like most (all?) of the s_inode list users are > > basically historical, and shouldn't use that list at all. And there > > aren't _that_ many of them. I think Dave was right in just saying that > > this list should go away entirely (or was it somebody else who made > > that comment?) > > Yeah, I said that it should go away entirely. > > My view of this whole s_inodes list is that subsystems that are > taking references to inodes *must* track or manage the references to > the inodes themselves. > > The canonical example is the VFS itself: evict_inodes() doesn't need > to iterate s_inodes at all. It can walk the inode LRU to purge all > the unreferenced cached inodes from memory. iput_final() guarantees > that all unreferenced inodes are either put on the LRU or torn down > immediately. > > Hence I think that it is a poor architectural decision to require > superblock teardown to clean up inode references random subsystems > have *leaked* to prevent UAFs. It forces the sb to track all > inodes whether the VFS actually needs to track them or not. > For fsnotify, I think we can/should maintain a list of marked inodes inside sb->s_fsnotify_info; we can iterate this private list in fsnotify_unmount_inodes() to remove the marks. TBH, I am not sure I understand the suggested change for inode lifetime. An inode can have a reference from a dentry or from some subsystem (e.g. fsnotify), which is responsible for putting its held reference before unmount. What is the alternative? Thanks, Amir. ^ permalink raw reply [flat|nested] 34+ messages in thread
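A rough sketch of the private-list idea floated above; sb->s_fsnotify_info and fsnotify_unmount_inodes() are existing names taken from the discussion, while the list, the entry type and the helpers are assumptions for illustration.

#include <linux/fs.h>
#include <linux/fsnotify.h>
#include <linux/list.h>
#include <linux/slab.h>

/* Hypothetical per-sb bookkeeping hanging off sb->s_fsnotify_info. */
struct example_marked_inode {
	struct list_head sb_node;
	struct inode *inode;		/* reference held while marks exist */
};

/* Hypothetical accessor for that private list. */
struct list_head *example_marked_list(struct super_block *sb);

static void example_fsnotify_unmount_inodes(struct super_block *sb)
{
	struct example_marked_inode *entry, *tmp;

	list_for_each_entry_safe(entry, tmp, example_marked_list(sb), sb_node) {
		struct inode *inode = entry->inode;

		fsnotify_inode(inode, FS_UNMOUNT);	/* IN_UMOUNT to watchers */
		fsnotify_inode_delete(inode);		/* tear down the marks */
		list_del_init(&entry->sb_node);
		iput(inode);				/* drop the mark's pin */
		kfree(entry);
	}
}

This keeps the unmount-time sweep but scopes it to inodes that actually have marks - the "one less s_inodes user" middle ground, as opposed to the bigger lifetime inversion discussed later in the thread.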
* Re: lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes() 2024-10-08 8:57 ` Amir Goldstein @ 2024-10-08 11:23 ` Jan Kara 2024-10-08 12:16 ` Christian Brauner 2024-10-08 23:44 ` Dave Chinner 0 siblings, 2 replies; 34+ messages in thread From: Jan Kara @ 2024-10-08 11:23 UTC (permalink / raw) To: Amir Goldstein Cc: Dave Chinner, Linus Torvalds, Jan Kara, Christoph Hellwig, linux-fsdevel, linux-xfs, linux-bcachefs, kent.overstreet, Mickaël Salaün, Jann Horn, Serge Hallyn, Kees Cook, linux-security-module On Tue 08-10-24 10:57:22, Amir Goldstein wrote: > On Tue, Oct 8, 2024 at 1:33 AM Dave Chinner <david@fromorbit.com> wrote: > > > > On Mon, Oct 07, 2024 at 01:37:19PM -0700, Linus Torvalds wrote: > > > On Thu, 3 Oct 2024 at 04:57, Jan Kara <jack@suse.cz> wrote: > > > > > > > > Fair enough. If we go with the iterator variant I've suggested to Dave in > > > > [1], we could combine the evict_inodes(), fsnotify_unmount_inodes() and > > > > Landlocks hook_sb_delete() into a single iteration relatively easily. But > > > > I'd wait with that convertion until this series lands. > > > > > > Honza, I looked at this a bit more, particularly with an eye of "what > > > happens if we just end up making the inode lifetimes subject to the > > > dentry lifetimes" as suggested by Dave elsewhere. > > > > .... > > > > > which makes the fsnotify_inode_delete() happen when the inode is > > > removed from the dentry. > > > > There may be other inode references being held that make > > the inode live longer than the dentry cache. When should the > > fsnotify marks be removed from the inode in that case? Do they need > > to remain until, e.g, writeback completes? > > > > fsnotify inode marks remain until explicitly removed or until sb > is unmounted (*), so other inode references are irrelevant to > inode mark removal. > > (*) fanotify has "evictable" inode marks, which do not hold inode > reference and go away on inode evict, but those mark evictions > do not generate any event (i.e. there is no FAN_UNMOUNT). Yes. Amir beat me with the response so let me just add that FS_UMOUNT event is for inotify which guarantees that either you get an event about somebody unlinking the inode (e.g. IN_DELETE_SELF) or event about filesystem being unmounted (IN_UMOUNT) if you place mark on some inode. I also don't see how we would maintain this behavior with what Linus proposes. > > > Then at umount time, the dentry shrinking will deal with all live > > > dentries, and at most the fsnotify layer would send the FS_UNMOUNT to > > > just the root dentry inodes? > > > > I don't think even that is necessary, because > > shrink_dcache_for_umount() drops the sb->s_root dentry after > > trimming the dentry tree. Hence the dcache drop would cleanup all > > inode references, roots included. > > > > > Wouldn't that make things much cleaner, and remove at least *one* odd > > > use of the nasty s_inodes list? > > > > Yes, it would, but someone who knows exactly when the fsnotify > > marks can be removed needs to chime in here... So fsnotify needs a list of inodes for the superblock which have marks attached and for which we hold inode reference. We can keep it inside fsnotify code although it would practically mean another list_head for the inode for this list (probably in our fsnotify_connector structure which connects list of notification marks to the inode). If we actually get rid of i_sb_list in struct inode, this will be a win for the overall system, otherwise it is a net loss IMHO. 
So if we can figure out how to change other s_inodes owners we can certainly do this fsnotify change. > > > > And I wonder if the quota code (which uses the s_inodes list to enable > > > quotas on already mounted filesystems) could for all the same reasons > > > just walk the dentry tree instead (and remove_dquot_ref similarly > > > could just remove it at dentry_unlink_inode() time)? > > > > I don't think that will work because we have to be able to modify > > quota in evict() processing. This is especially true for unlinked > > inodes being evicted from cache, but also the dquots need to stay > > attached until writeback completes. > > > > Hence I don't think we can remove the quota refs from the inode > > before we call iput_final(), and so I think quotaoff (at least) > > still needs to iterate inodes... Yeah, I'm not sure how to get rid of the s_inodes use in quota code. One of the things we need the s_inodes list for is during quotaoff on a mounted filesystem, when we need to iterate all inodes which are referencing quota structures and free them. In theory we could keep a list of inodes referencing quota structures but that would require adding a list_head to the inode structure for filesystems that support quotas. Now for the sake of full context I'll also say that enabling / disabling quotas on a mounted filesystem is a legacy feature because it is quite easy for quota accounting to go wrong with it. So ext4 and f2fs have supported for quite a few years a mode where quota tracking is enabled on mount and disabled on unmount (if the appropriate fs feature is enabled) and you can only enable / disable enforcement of quota limits during runtime. So I could see us deprecating this functionality altogether although jfs never adapted to this new way we do quotas so we'd have to deal with that somehow. But one way or another it would take a significant amount of time before we can completely remove this so it is out of the question for this series. I see one problem with the idea "whoever has a need to iterate inodes needs to keep track of inodes it needs to iterate through". It is fine conceptually but with the s_inodes list we pay the cost only once and multiple users benefit. With each subsystem tracking inodes we pay the cost for each user (both in terms of memory and CPU). So if you don't use any of the subsystems that need iteration, you win, but if you use two or more of these subsystems, in particular those which need to track a significant portion of all inodes, you are losing. Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 34+ messages in thread
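For context, this is roughly the shape of the quotaoff pass being described, i.e. why the quota code currently wants to walk sb->s_inodes at all. It is heavily simplified; the real remove_dquot_ref() walk takes sb->s_inode_list_lock, skips inodes that are being freed, and defers the final dqput().

#include <linux/fs.h>
#include <linux/quotaops.h>

static void example_remove_dquot_refs(struct super_block *sb)
{
	struct inode *inode;

	/* Simplified walk: no s_inode_list_lock, no I_FREEING checks. */
	list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
		if (IS_NOQUOTA(inode))		/* quota files themselves are exempt */
			continue;
		dquot_drop(inode);		/* detach and release attached dquot references */
	}
}

Avoiding this walk without an sb-wide inode list would mean the quota code keeping its own list of inodes that hold dquot references, which is the extra list_head cost mentioned above.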
* Re: lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes() 2024-10-08 11:23 ` Jan Kara @ 2024-10-08 12:16 ` Christian Brauner 2024-10-09 0:03 ` Dave Chinner 2024-10-08 23:44 ` Dave Chinner 1 sibling, 1 reply; 34+ messages in thread From: Christian Brauner @ 2024-10-08 12:16 UTC (permalink / raw) To: Jan Kara Cc: Amir Goldstein, Dave Chinner, Linus Torvalds, Christoph Hellwig, linux-fsdevel, linux-xfs, linux-bcachefs, kent.overstreet, Mickaël Salaün, Jann Horn, Serge Hallyn, Kees Cook, linux-security-module On Tue, Oct 08, 2024 at 01:23:44PM GMT, Jan Kara wrote: > On Tue 08-10-24 10:57:22, Amir Goldstein wrote: > > On Tue, Oct 8, 2024 at 1:33 AM Dave Chinner <david@fromorbit.com> wrote: > > > > > > On Mon, Oct 07, 2024 at 01:37:19PM -0700, Linus Torvalds wrote: > > > > On Thu, 3 Oct 2024 at 04:57, Jan Kara <jack@suse.cz> wrote: > > > > > > > > > > Fair enough. If we go with the iterator variant I've suggested to Dave in > > > > > [1], we could combine the evict_inodes(), fsnotify_unmount_inodes() and > > > > > Landlocks hook_sb_delete() into a single iteration relatively easily. But > > > > > I'd wait with that convertion until this series lands. > > > > > > > > Honza, I looked at this a bit more, particularly with an eye of "what > > > > happens if we just end up making the inode lifetimes subject to the > > > > dentry lifetimes" as suggested by Dave elsewhere. > > > > > > .... > > > > > > > which makes the fsnotify_inode_delete() happen when the inode is > > > > removed from the dentry. > > > > > > There may be other inode references being held that make > > > the inode live longer than the dentry cache. When should the > > > fsnotify marks be removed from the inode in that case? Do they need > > > to remain until, e.g, writeback completes? > > > > > > > fsnotify inode marks remain until explicitly removed or until sb > > is unmounted (*), so other inode references are irrelevant to > > inode mark removal. > > > > (*) fanotify has "evictable" inode marks, which do not hold inode > > reference and go away on inode evict, but those mark evictions > > do not generate any event (i.e. there is no FAN_UNMOUNT). > > Yes. Amir beat me with the response so let me just add that FS_UMOUNT event > is for inotify which guarantees that either you get an event about somebody > unlinking the inode (e.g. IN_DELETE_SELF) or event about filesystem being > unmounted (IN_UMOUNT) if you place mark on some inode. I also don't see how > we would maintain this behavior with what Linus proposes. > > > > > Then at umount time, the dentry shrinking will deal with all live > > > > dentries, and at most the fsnotify layer would send the FS_UNMOUNT to > > > > just the root dentry inodes? > > > > > > I don't think even that is necessary, because > > > shrink_dcache_for_umount() drops the sb->s_root dentry after > > > trimming the dentry tree. Hence the dcache drop would cleanup all > > > inode references, roots included. > > > > > > > Wouldn't that make things much cleaner, and remove at least *one* odd > > > > use of the nasty s_inodes list? > > > > > > Yes, it would, but someone who knows exactly when the fsnotify > > > marks can be removed needs to chime in here... > > So fsnotify needs a list of inodes for the superblock which have marks > attached and for which we hold inode reference. 
We can keep it inside > fsnotify code although it would practically mean another list_head for the > inode for this list (probably in our fsnotify_connector structure which > connects list of notification marks to the inode). If we actually get rid > of i_sb_list in struct inode, this will be a win for the overall system, > otherwise it is a net loss IMHO. So if we can figure out how to change > other s_inodes owners we can certainly do this fsnotify change. > > > > > And I wonder if the quota code (which uses the s_inodes list to enable > > > > quotas on already mounted filesystems) could for all the same reasons > > > > just walk the dentry tree instead (and remove_dquot_ref similarly > > > > could just remove it at dentry_unlink_inode() time)? > > > > > > I don't think that will work because we have to be able to modify > > > quota in evict() processing. This is especially true for unlinked > > > inodes being evicted from cache, but also the dquots need to stay > > > attached until writeback completes. > > > > > > Hence I don't think we can remove the quota refs from the inode > > > before we call iput_final(), and so I think quotaoff (at least) > > > still needs to iterate inodes... > > Yeah, I'm not sure how to get rid of the s_inodes use in quota code. One of > the things we need s_inodes list for is during quotaoff on a mounted > filesystem when we need to iterate all inodes which are referencing quota > structures and free them. In theory we could keep a list of inodes > referencing quota structures but that would require adding list_head to > inode structure for filesystems that support quotas. Now for the sake of > full context I'll also say that enabling / disabling quotas on a mounted > filesystem is a legacy feature because it is quite easy that quota > accounting goes wrong with it. So ext4 and f2fs support for quite a few > years a mode where quota tracking is enabled on mount and disabled on > unmount (if appropriate fs feature is enabled) and you can only enable / > disable enforcement of quota limits during runtime. So I could see us > deprecating this functionality altogether although jfs never adapted to > this new way we do quotas so we'd have to deal with that somehow. But one > way or another it would take a significant amount of time before we can > completely remove this so it is out of question for this series. I still maintain that we don't need to solve the fsnotify and lsm rework as part of this particular series. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes() 2024-10-08 12:16 ` Christian Brauner @ 2024-10-09 0:03 ` Dave Chinner 0 siblings, 0 replies; 34+ messages in thread From: Dave Chinner @ 2024-10-09 0:03 UTC (permalink / raw) To: Christian Brauner Cc: Jan Kara, Amir Goldstein, Linus Torvalds, Christoph Hellwig, linux-fsdevel, linux-xfs, linux-bcachefs, kent.overstreet, Mickaël Salaün, Jann Horn, Serge Hallyn, Kees Cook, linux-security-module On Tue, Oct 08, 2024 at 02:16:04PM +0200, Christian Brauner wrote: > I still maintain that we don't need to solve the fsnotify and lsm rework > as part of this particular series. Sure, I heard you the first time. :) However, the patchset I posted was just a means to start the discussion with a concrete proposal. Now I'm trying to work out how all the pieces of the bigger puzzle go together as people think about what it means, not polish the first little step. -Dave. -- Dave Chinner david@fromorbit.com ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes() 2024-10-08 11:23 ` Jan Kara 2024-10-08 12:16 ` Christian Brauner @ 2024-10-08 23:44 ` Dave Chinner 2024-10-09 6:10 ` Amir Goldstein 2024-10-09 14:18 ` Jan Kara 1 sibling, 2 replies; 34+ messages in thread From: Dave Chinner @ 2024-10-08 23:44 UTC (permalink / raw) To: Jan Kara Cc: Amir Goldstein, Linus Torvalds, Christoph Hellwig, linux-fsdevel, linux-xfs, linux-bcachefs, kent.overstreet, Mickaël Salaün, Jann Horn, Serge Hallyn, Kees Cook, linux-security-module On Tue, Oct 08, 2024 at 01:23:44PM +0200, Jan Kara wrote: > On Tue 08-10-24 10:57:22, Amir Goldstein wrote: > > On Tue, Oct 8, 2024 at 1:33 AM Dave Chinner <david@fromorbit.com> wrote: > > > > > > On Mon, Oct 07, 2024 at 01:37:19PM -0700, Linus Torvalds wrote: > > > > On Thu, 3 Oct 2024 at 04:57, Jan Kara <jack@suse.cz> wrote: > > > > > > > > > > Fair enough. If we go with the iterator variant I've suggested to Dave in > > > > > [1], we could combine the evict_inodes(), fsnotify_unmount_inodes() and > > > > > Landlocks hook_sb_delete() into a single iteration relatively easily. But > > > > > I'd wait with that convertion until this series lands. > > > > > > > > Honza, I looked at this a bit more, particularly with an eye of "what > > > > happens if we just end up making the inode lifetimes subject to the > > > > dentry lifetimes" as suggested by Dave elsewhere. > > > > > > .... > > > > > > > which makes the fsnotify_inode_delete() happen when the inode is > > > > removed from the dentry. > > > > > > There may be other inode references being held that make > > > the inode live longer than the dentry cache. When should the > > > fsnotify marks be removed from the inode in that case? Do they need > > > to remain until, e.g, writeback completes? > > > > > > > fsnotify inode marks remain until explicitly removed or until sb > > is unmounted (*), so other inode references are irrelevant to > > inode mark removal. > > > > (*) fanotify has "evictable" inode marks, which do not hold inode > > reference and go away on inode evict, but those mark evictions > > do not generate any event (i.e. there is no FAN_UNMOUNT). > > Yes. Amir beat me with the response so let me just add that FS_UMOUNT event > is for inotify which guarantees that either you get an event about somebody > unlinking the inode (e.g. IN_DELETE_SELF) or event about filesystem being > unmounted (IN_UMOUNT) if you place mark on some inode. I also don't see how > we would maintain this behavior with what Linus proposes. Thanks. I didn't respond last night when I read Amir's description because I wanted to think it over. Knowing where the unmount event requirement comes from certainly helps. I am probably missing something important, but it really seems to me that the object reference counting model is back to front. Currently the mark is being attached to the inode and then the inode is pinned by a reference count to make the mark attached to the inode persistent until unmount. This then requires the inodes to be swept by unmount because fsnotify has effectively leaked them as it isn't tracking such inodes itself. [ Keep in mind that I'm not saying this was a bad or wrong thing to do because the s_inodes list was there to be able to do this sort of lazy cleanup. But now that we want to remove the s_inodes list if at all possible, it is a problem we need to solve differently.
] AFAICT, inotify does not appear to require the inode to send events - it only requires access to the inode mark itself. Hence it does not need the inode in cache to generate IN_UNMOUNT events; it just needs the mark itself to be findable at unmount. Do any of the other backends that require unmount notifications need special access to the inode itself? If not, and the fsnotify sb info is tracking these persistent marks, then we don't need to iterate inodes at unmount. This means we don't need to pin inodes when they have marks attached, and so the dependency on the s_inodes list goes away. With this inverted model, we need the first fsnotify event callout after the inode is instantiated to look for a persistent mark for the inode. We know how to do this efficiently - it's exactly the same caching model we use for ACLs. On the first lookup, we check the inode for ACL data and set the ACL pointer appropriately to indicate that a lookup has been done and there are no ACLs associated with the inode. At this point, the fsnotify inode marks can all be removed from the inode when it is being evicted and there's no need for fsnotify to pin inodes at all. > > > > Then at umount time, the dentry shrinking will deal with all live > > > > dentries, and at most the fsnotify layer would send the FS_UNMOUNT to > > > > just the root dentry inodes? > > > > > > I don't think even that is necessary, because > > > shrink_dcache_for_umount() drops the sb->s_root dentry after > > > trimming the dentry tree. Hence the dcache drop would cleanup all > > > inode references, roots included. > > > > > > > Wouldn't that make things much cleaner, and remove at least *one* odd > > > > use of the nasty s_inodes list? > > > > > > Yes, it would, but someone who knows exactly when the fsnotify > > > marks can be removed needs to chime in here... > > So fsnotify needs a list of inodes for the superblock which have marks > attached and for which we hold inode reference. We can keep it inside > fsnotify code although it would practically mean another list_head for the > inode for this list (probably in our fsnotify_connector structure which > connects list of notification marks to the inode). I don't think that is necessary. We need to get rid of the inode reference, not move where we track inode references. The persistent object is the fsnotify mark, not the cached inode. It's the mark that needs to be persistent, and that's what the fsnotify code should be tracking. The fsnotify marks are much smaller than inodes, and there are going to be fewer cached marks than inodes, especially once inode pinning is removed. Hence I think this will result in a net reduction in memory footprint for "marked-until-unmount" configurations as we won't pin nearly as many inodes in cache... > If we actually get rid > of i_sb_list in struct inode, this will be a win for the overall system, > otherwise it is a net loss IMHO. So if we can figure out how to change > other s_inodes owners we can certainly do this fsnotify change. Yes, I am exploring what it would take to get rid of i_sb_list altogether right now. That said, I don't think this is a concern given the difference in memory footprint of the same number of persistent marks. i.e. "persistent mark, reclaimable inode" will always have a significantly lower memory footprint than "persistent inode and mark" under memory pressure....
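A minimal sketch of the ACL-style caching model being described here: i_fsnotify_marks is the existing per-inode connector pointer, but the sentinel value, the lookup helper and the idea of initialising the pointer to a "not looked up yet" state are assumptions about how such an inversion could look, not current fsnotify behaviour (__rcu handling omitted for brevity).

#include <linux/fs.h>
#include <linux/fsnotify_backend.h>
#include <linux/atomic.h>

/* Hypothetical "haven't looked yet" sentinel, like ACL_NOT_CACHED. */
#define EXAMPLE_MARKS_NOT_CACHED ((struct fsnotify_mark_connector *)-1)

/* Hypothetical lookup of persistent marks recorded for this inode in
 * per-sb fsnotify state (e.g. keyed by inode number / file handle). */
struct fsnotify_mark_connector *example_lookup_persistent(struct inode *inode);

static struct fsnotify_mark_connector *example_get_marks(struct inode *inode)
{
	struct fsnotify_mark_connector *conn = READ_ONCE(inode->i_fsnotify_marks);

	if (conn != EXAMPLE_MARKS_NOT_CACHED)
		return conn;	/* NULL means "known to have no marks" */

	/* First event since this inode was instantiated: attach any
	 * persistent marks, or cache the negative result.  A real
	 * implementation would free the duplicate on a lost race. */
	conn = example_lookup_persistent(inode);
	cmpxchg(&inode->i_fsnotify_marks, EXAMPLE_MARKS_NOT_CACHED, conn);
	return READ_ONCE(inode->i_fsnotify_marks);
}

If the lookup comes back empty, the inode pays one pointer check per event and nothing else, which is what makes the "persistent mark, reclaimable inode" trade-off cheap.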
> > > > And I wonder if the quota code (which uses the s_inodes list > > > > to enable quotas on already mounted filesystems) could for > > > > all the same reasons just walk the dentry tree instead (and > > > > remove_dquot_ref similarly could just remove it at > > > > dentry_unlink_inode() time)? > > > > > > I don't think that will work because we have to be able to > > > modify quota in evict() processing. This is especially true > > > for unlinked inodes being evicted from cache, but also the > > > dquots need to stay attached until writeback completes. > > > > > > Hence I don't think we can remove the quota refs from the > > > inode before we call iput_final(), and so I think quotaoff (at > > > least) still needs to iterate inodes... > > Yeah, I'm not sure how to get rid of the s_inodes use in quota > code. One of the things we need s_inodes list for is during > quotaoff on a mounted filesystem when we need to iterate all > inodes which are referencing quota structures and free them. In > theory we could keep a list of inodes referencing quota structures > but that would require adding list_head to inode structure for > filesystems that support quotas. I don't think that's quite true. Quota is not modular, so we can lazily free quota objects even when quota is turned off. All we need to ensure is that code checks whether quota is enabled, not for the existence of quota objects attached to the inode. Hence quota-off simply turns off all the quota operations in memory, and normal inode eviction cleans up the stale quota objects naturally. My main question is why the quota-on add_dquot_ref() pass is required. AFAICT all of the filesystem operations that will modify quota call dquot_initialize() directly to attach the required dquots to the inode before the operation is started. If that's true, then why does quota-on need to do this for all the inodes that are already in cache? i.e. I'm not sure I understand why we need quota to do these iterations at all... > Now for the sake of > full context I'll also say that enabling / disabling quotas on a mounted > filesystem is a legacy feature because it is quite easy that quota > accounting goes wrong with it. So ext4 and f2fs support for quite a few > years a mode where quota tracking is enabled on mount and disabled on > unmount (if appropriate fs feature is enabled) and you can only enable / > disable enforcement of quota limits during runtime. Sure, this is how XFS works, too. But I think this behaviour is largely irrelevant because there are still filesystems out there that do stuff the old way... > So I could see us > deprecating this functionality altogether although jfs never adapted to > this new way we do quotas so we'd have to deal with that somehow. But one > way or another it would take a significant amount of time before we can > completely remove this so it is out of question for this series. I'm not sure that matters, though it adds to the reasons why we should be removing old, unmaintained filesystems from the tree and old, outdated formats from maintained filesystems.... > I see one problem with the idea "whoever has a need to iterate inodes needs > to keep track of inodes it needs to iterate through". It is fine > conceptually but with s_inodes list we pay the cost only once and multiple > users benefit. With each subsystem tracking inodes we pay the cost for each > user (both in terms of memory and CPU). 
So if you don't use any of the > subsystems that need iteration, you win, but if you use two or more of > these subsystems, in particular those which need to track significant > portion of all inodes, you are losing. AFAICT, most of the subsystems don't need to track inodes directly. We don't need s_inodes for evict_inodes() - we have the inode LRU tracking all unreferenced inodes on the superblock. The GFS2 use case can probably walk the inode LRU directly, too. It looks to me that we can avoid needing unmount iteration for fsnotify, and I suspect landlock can likely use the same persistence inversion as fsnotify (same persistent ruleset model). The bdev superblock can implement its own internal list using inode->i_devices as this list_head is only used by chardev inodes. All that then remains is the page cache dropping code, and that's not really critical to have exacting behaviour. We certainly shouldn't be taking a runtime penalty just to optimise the rare case of dropping caches. IOWs, there aren't that many users, and I think there are ways to make all these iterations go away without adding new per-inode list heads to track inodes. -Dave. -- Dave Chinner david@fromorbit.com ^ permalink raw reply [flat|nested] 34+ messages in thread
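And a sketch of the quota half of the argument from earlier in this message: if every quota operation checks "is quota enabled" rather than "are dquots attached", quotaoff can just flip state and let normal eviction release whatever stale dquots remain. The state helper is assumed; dquot_drop(), truncate_inode_pages_final() and clear_inode() are the usual calls a filesystem's ->evict_inode makes (ordering simplified).

#include <linux/fs.h>
#include <linux/mm.h>
#include <linux/quotaops.h>

/* Assumed helper: atomically mark this quota type disabled so later
 * dquot_initialize()/dquot_alloc_*() calls become no-ops. */
void example_mark_quota_disabled(struct super_block *sb, int type);

static void example_lazy_quota_off(struct super_block *sb, int type)
{
	example_mark_quota_disabled(sb, type);
	/* No sb->s_inodes sweep: stale dquots on cached inodes are
	 * released whenever those inodes are evicted (below). */
}

static void example_evict_inode(struct inode *inode)
{
	truncate_inode_pages_final(&inode->i_data);
	clear_inode(inode);
	dquot_drop(inode);	/* frees any dquot references still attached */
}

Whether dquots really can be dropped this lazily in all cases (e.g. around writeback and unlinked-inode accounting) is the open question raised above, not something this sketch settles.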
* Re: lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes() 2024-10-08 23:44 ` Dave Chinner @ 2024-10-09 6:10 ` Amir Goldstein 2024-10-09 14:18 ` Jan Kara 1 sibling, 0 replies; 34+ messages in thread From: Amir Goldstein @ 2024-10-09 6:10 UTC (permalink / raw) To: Dave Chinner Cc: Jan Kara, Linus Torvalds, Christoph Hellwig, linux-fsdevel, linux-xfs, linux-bcachefs, kent.overstreet, Mickaël Salaün, Jann Horn, Serge Hallyn, Kees Cook, linux-security-module On Wed, Oct 9, 2024 at 1:44 AM Dave Chinner <david@fromorbit.com> wrote: > > On Tue, Oct 08, 2024 at 01:23:44PM +0200, Jan Kara wrote: > > On Tue 08-10-24 10:57:22, Amir Goldstein wrote: > > > On Tue, Oct 8, 2024 at 1:33 AM Dave Chinner <david@fromorbit.com> wrote: > > > > > > > > On Mon, Oct 07, 2024 at 01:37:19PM -0700, Linus Torvalds wrote: > > > > > On Thu, 3 Oct 2024 at 04:57, Jan Kara <jack@suse.cz> wrote: > > > > > > > > > > > > Fair enough. If we go with the iterator variant I've suggested to Dave in > > > > > > [1], we could combine the evict_inodes(), fsnotify_unmount_inodes() and > > > > > > Landlocks hook_sb_delete() into a single iteration relatively easily. But > > > > > > I'd wait with that convertion until this series lands. > > > > > > > > > > Honza, I looked at this a bit more, particularly with an eye of "what > > > > > happens if we just end up making the inode lifetimes subject to the > > > > > dentry lifetimes" as suggested by Dave elsewhere. > > > > > > > > .... > > > > > > > > > which makes the fsnotify_inode_delete() happen when the inode is > > > > > removed from the dentry. > > > > > > > > There may be other inode references being held that make > > > > the inode live longer than the dentry cache. When should the > > > > fsnotify marks be removed from the inode in that case? Do they need > > > > to remain until, e.g, writeback completes? > > > > > > > > > > fsnotify inode marks remain until explicitly removed or until sb > > > is unmounted (*), so other inode references are irrelevant to > > > inode mark removal. > > > > > > (*) fanotify has "evictable" inode marks, which do not hold inode > > > reference and go away on inode evict, but those mark evictions > > > do not generate any event (i.e. there is no FAN_UNMOUNT). > > > > Yes. Amir beat me with the response so let me just add that FS_UMOUNT event > > is for inotify which guarantees that either you get an event about somebody > > unlinking the inode (e.g. IN_DELETE_SELF) or event about filesystem being > > unmounted (IN_UMOUNT) if you place mark on some inode. I also don't see how > > we would maintain this behavior with what Linus proposes. > > Thanks. I didn't respond last night when I read Amir's decription > because I wanted to think it over. Knowing where the unmount event > requirement certainly helps. > > I am probably missing something important, but it really seems to me > that the object reference counting model is the back to > front. Currently the mark is being attached to the inode and then > the inode pinned by a reference count to make the mark attached > to the inode persistent until unmount. This then requires the inodes > to be swept by unmount because fsnotify has effectively leaked them > as it isn't tracking such inodes itself. > > [ Keep in mind that I'm not saying this was a bad or wrong thing to > do because the s_inodes list was there to be able to do this sort of > lazy cleanup. 
But now that we want to remove the s_inodes list if at > all possible, it is a problem we need to solve differently. ] > > AFAICT, inotify does not appear to require the inode to send events > - it only requires access to the inode mark itself. Hence it does > not the inode in cache to generate IN_UNMOUNT events, it just > needs the mark itself to be findable at unmount. Do any of the > other backends that require unmount notifications that require > special access to the inode itself? > No other backend supports IN_UNMOUNT/FS_UNMOUNT. We want to add unmount events support to fanotify, but those are only going to be possible for watching a mount or an sb, not inodes. > If not, and the fsnotify sb info is tracking these persistent marks, > then we don't need to iterate inodes at unmount. This means we don't > need to pin inodes when they have marks attached, and so the > dependency on the s_inodes list goes away. > > With this inverted model, we need the first fsnotify event callout > after the inode is instantiated to look for a persistent mark for > the inode. We know how to do this efficiently - it's exactly the > same caching model we use for ACLs. On the first lookup, we check > the inode for ACL data and set the ACL pointer appropriately to > indicate that a lookup has been done and there are no ACLs > associated with the inode. > > At this point, the fsnotify inode marks can all be removed from the > inode when it is being evicted and there's no need for fsnotify to > pin inodes at all. > > > > > > Then at umount time, the dentry shrinking will deal with all live > > > > > dentries, and at most the fsnotify layer would send the FS_UNMOUNT to > > > > > just the root dentry inodes? > > > > > > > > I don't think even that is necessary, because > > > > shrink_dcache_for_umount() drops the sb->s_root dentry after > > > > trimming the dentry tree. Hence the dcache drop would cleanup all > > > > inode references, roots included. > > > > > > > > > Wouldn't that make things much cleaner, and remove at least *one* odd > > > > > use of the nasty s_inodes list? > > > > > > > > Yes, it would, but someone who knows exactly when the fsnotify > > > > marks can be removed needs to chime in here... > > > > So fsnotify needs a list of inodes for the superblock which have marks > > attached and for which we hold inode reference. We can keep it inside > > fsnotify code although it would practically mean another list_head for the > > inode for this list (probably in our fsnotify_connector structure which > > connects list of notification marks to the inode). > > I don't think that is necessary. We need to get rid of the inode > reference, not move where we track inode references. The persistent > object is the fsnotify mark, not the cached inode. It's the mark > that needs to be persistent, and that's what the fsnotify code > should be tracking. > > The fsnotify marks are much smaller than inodes, and there going to > be fewer cached marks than inodes, especially once inode pinning is > removed. Hence I think this will result in a net reduction in memory > footprint for "marked-until-unmount" configurations as we won't pin > nearly as many inodes in cache... > It is a feasible design which has all the benefits that you listed. But it is a big change, just to get away from s_inodes (much easier to maintain a private list of pinned inodes). 
inotify (recursive tree watches, for that matter) has been inefficient that way for a long time, and users now have less memory-hogging solutions like fanotify mount and sb marks. Granted, not unprivileged users, but still. So there needs to be a good justification to make this design change. One such justification would be to provide the infrastructure for the feature that Jan referred to as the "holy grail" in his LPC talk, namely, subtree watches. If we introduce code that looks up persistent "mark rules" on inode instantiation, then we could use it to "reconnect" inotify persistent inode marks (by ino/fid) or to establish automatic marks based on subtree/path-based rules. The audit code has something that resembles this, and I suspect that Landlock is doing something similar (?), but I didn't check. Path-based rules are always going to be elusive and tricky, and Al is always going to hate them ;) Bottom line - good idea, not easy, requires allocating development resources. Thanks, Amir. ^ permalink raw reply [flat|nested] 34+ messages in thread
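A minimal userspace sketch of the ACL-style scheme described above - look up a persistent mark on the first fsnotify callout for an inode and cache either the mark or a "no mark" sentinel so the lookup happens only once. All names here (struct sb_marks, get_inode_mark(), MARK_NOT_CACHED) are hypothetical stand-ins, not actual fsnotify or VFS API:

    /*
     * Hypothetical sketch, not kernel code: a per-superblock store of
     * persistent marks, consulted lazily on first access and cached on
     * the in-core inode the same way ACLs are cached.
     */
    #include <stdio.h>
    #include <stdint.h>

    struct mark {
            uint64_t ino;           /* identity of the watched object */
            uint32_t mask;          /* events of interest */
            struct mark *next;
    };

    #define MARK_NOT_CACHED ((struct mark *)-1)     /* like ACL_NOT_CACHED */

    struct sb_marks {               /* per-sb store of persistent marks */
            struct mark *head;
    };

    struct toy_inode {
            uint64_t ino;
            struct mark *i_mark;    /* MARK_NOT_CACHED until first lookup */
    };

    /* Find a persistent mark for this inode in the per-sb store. */
    static struct mark *sb_marks_lookup(struct sb_marks *sbm, uint64_t ino)
    {
            struct mark *m;

            for (m = sbm->head; m; m = m->next)
                    if (m->ino == ino)
                            return m;
            return NULL;
    }

    /* First event for an inode resolves and caches the mark (or NULL). */
    static struct mark *get_inode_mark(struct sb_marks *sbm, struct toy_inode *inode)
    {
            if (inode->i_mark == MARK_NOT_CACHED)
                    inode->i_mark = sb_marks_lookup(sbm, inode->ino);
            return inode->i_mark;
    }

    int main(void)
    {
            struct mark m = { .ino = 42, .mask = 0x1, .next = NULL };
            struct sb_marks sbm = { .head = &m };
            struct toy_inode a = { .ino = 42, .i_mark = MARK_NOT_CACHED };
            struct toy_inode b = { .ino = 7,  .i_mark = MARK_NOT_CACHED };

            printf("inode 42 watched: %s\n", get_inode_mark(&sbm, &a) ? "yes" : "no");
            printf("inode  7 watched: %s\n", get_inode_mark(&sbm, &b) ? "yes" : "no");
            return 0;
    }

A real implementation would have to key the per-sb store by something that survives eviction (ino/fid, as noted above) and deal with locking and iget5()-style inode identity, which is exactly the snag Jan raises below.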
* Re: lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes() 2024-10-08 23:44 ` Dave Chinner 2024-10-09 6:10 ` Amir Goldstein @ 2024-10-09 14:18 ` Jan Kara 1 sibling, 0 replies; 34+ messages in thread From: Jan Kara @ 2024-10-09 14:18 UTC (permalink / raw) To: Dave Chinner Cc: Jan Kara, Amir Goldstein, Linus Torvalds, Christoph Hellwig, linux-fsdevel, linux-xfs, linux-bcachefs, kent.overstreet, Mickaël Salaün, Jann Horn, Serge Hallyn, Kees Cook, linux-security-module On Wed 09-10-24 10:44:12, Dave Chinner wrote: > On Tue, Oct 08, 2024 at 01:23:44PM +0200, Jan Kara wrote: > > On Tue 08-10-24 10:57:22, Amir Goldstein wrote: > > > On Tue, Oct 8, 2024 at 1:33 AM Dave Chinner <david@fromorbit.com> wrote: > > > > > > > > On Mon, Oct 07, 2024 at 01:37:19PM -0700, Linus Torvalds wrote: > > > > > On Thu, 3 Oct 2024 at 04:57, Jan Kara <jack@suse.cz> wrote: > > > > > > > > > > > > Fair enough. If we go with the iterator variant I've suggested to Dave in > > > > > > [1], we could combine the evict_inodes(), fsnotify_unmount_inodes() and > > > > > > Landlocks hook_sb_delete() into a single iteration relatively easily. But > > > > > > I'd wait with that convertion until this series lands. > > > > > > > > > > Honza, I looked at this a bit more, particularly with an eye of "what > > > > > happens if we just end up making the inode lifetimes subject to the > > > > > dentry lifetimes" as suggested by Dave elsewhere. > > > > > > > > .... > > > > > > > > > which makes the fsnotify_inode_delete() happen when the inode is > > > > > removed from the dentry. > > > > > > > > There may be other inode references being held that make > > > > the inode live longer than the dentry cache. When should the > > > > fsnotify marks be removed from the inode in that case? Do they need > > > > to remain until, e.g, writeback completes? > > > > > > > > > > fsnotify inode marks remain until explicitly removed or until sb > > > is unmounted (*), so other inode references are irrelevant to > > > inode mark removal. > > > > > > (*) fanotify has "evictable" inode marks, which do not hold inode > > > reference and go away on inode evict, but those mark evictions > > > do not generate any event (i.e. there is no FAN_UNMOUNT). > > > > Yes. Amir beat me with the response so let me just add that FS_UMOUNT event > > is for inotify which guarantees that either you get an event about somebody > > unlinking the inode (e.g. IN_DELETE_SELF) or event about filesystem being > > unmounted (IN_UMOUNT) if you place mark on some inode. I also don't see how > > we would maintain this behavior with what Linus proposes. > > Thanks. I didn't respond last night when I read Amir's decription > because I wanted to think it over. Knowing where the unmount event > requirement certainly helps. > > I am probably missing something important, but it really seems to me > that the object reference counting model is the back to > front. Currently the mark is being attached to the inode and then > the inode pinned by a reference count to make the mark attached > to the inode persistent until unmount. This then requires the inodes > to be swept by unmount because fsnotify has effectively leaked them > as it isn't tracking such inodes itself. > > [ Keep in mind that I'm not saying this was a bad or wrong thing to > do because the s_inodes list was there to be able to do this sort of > lazy cleanup. But now that we want to remove the s_inodes list if at > all possible, it is a problem we need to solve differently. 
] Yes, agreed. > AFAICT, inotify does not appear to require the inode to send events > - it only requires access to the inode mark itself. Hence it does > not need the inode in cache to generate IN_UNMOUNT events, it just > needs the mark itself to be findable at unmount. Do any of the > other backends that require unmount notifications also require > special access to the inode itself? No, I don't think unmount notification requires looking at the inode and it is an inotify-specific thing as Amir wrote. We do require inode access when generating fanotify events (to open an fd where the event happened) but that gets handled separately by creating a struct path when the event happens and using it for dentry_open() later when reporting to userspace, so that carries its own set of dentry + mnt references while the event is waiting in the queue. > If not, and the fsnotify sb info is tracking these persistent marks, > then we don't need to iterate inodes at unmount. This means we don't > need to pin inodes when they have marks attached, and so the > dependency on the s_inodes list goes away. > > With this inverted model, we need the first fsnotify event callout > after the inode is instantiated to look for a persistent mark for > the inode. We know how to do this efficiently - it's exactly the > same caching model we use for ACLs. On the first lookup, we check > the inode for ACL data and set the ACL pointer appropriately to > indicate that a lookup has been done and there are no ACLs > associated with the inode. Yes, I agree such a scheme should be possible although a small snag I see is that we need to keep enough info in the fsnotify mark so that it can be associated with an inode when it is read from the disk. And this info is filesystem-specific with uncertain size for filesystems which use iget5(). So I suspect we'll need some support from individual filesystems, which is always tedious. > At this point, the fsnotify inode marks can all be removed from the > inode when it is being evicted and there's no need for fsnotify to > pin inodes at all. > > > > > > Then at umount time, the dentry shrinking will deal with all live > > > > > dentries, and at most the fsnotify layer would send the FS_UNMOUNT to > > > > > just the root dentry inodes? > > > > > > > > I don't think even that is necessary, because > > > > shrink_dcache_for_umount() drops the sb->s_root dentry after > > > > trimming the dentry tree. Hence the dcache drop would cleanup all > > > > inode references, roots included. > > > > > > > > > Wouldn't that make things much cleaner, and remove at least *one* odd > > > > > use of the nasty s_inodes list? > > > > > > > > Yes, it would, but someone who knows exactly when the fsnotify > > > > marks can be removed needs to chime in here... > > > > So fsnotify needs a list of inodes for the superblock which have marks > > attached and for which we hold inode reference. We can keep it inside > > fsnotify code although it would practically mean another list_head for the > > inode for this list (probably in our fsnotify_connector structure which > > connects list of notification marks to the inode). > > I don't think that is necessary. We need to get rid of the inode > reference, not move where we track inode references. The persistent > object is the fsnotify mark, not the cached inode. It's the mark > that needs to be persistent, and that's what the fsnotify code > should be tracking. Right, I was not precise here. We don't need a list of tracked inodes. 
We are fine with a list of all marks for inodes on a superblock which we could crawl on umount. > The fsnotify marks are much smaller than inodes, and there are going to > be fewer cached marks than inodes, especially once inode pinning is > removed. Hence I think this will result in a net reduction in memory > footprint for "marked-until-unmount" configurations as we won't pin > nearly as many inodes in cache... I agree. If fsnotify marks stop pinning inodes, we'll probably win much more memory by keeping inodes reclaimable than we lose to the extra overhead of the mark tracking. > > > > > And I wonder if the quota code (which uses the s_inodes list > > > > > to enable quotas on already mounted filesystems) could for > > > > > all the same reasons just walk the dentry tree instead (and > > > > > remove_dquot_ref similarly could just remove it at > > > > > dentry_unlink_inode() time)? > > > > > > > > I don't think that will work because we have to be able to > > > > modify quota in evict() processing. This is especially true > > > > for unlinked inodes being evicted from cache, but also the > > > > dquots need to stay attached until writeback completes. > > > > > > > > Hence I don't think we can remove the quota refs from the > > > > inode before we call iput_final(), and so I think quotaoff (at > > > > least) still needs to iterate inodes... > > > > Yeah, I'm not sure how to get rid of the s_inodes use in quota > > code. One of the things we need s_inodes list for is during > > quotaoff on a mounted filesystem when we need to iterate all > > inodes which are referencing quota structures and free them. In > > theory we could keep a list of inodes referencing quota structures > > but that would require adding list_head to inode structure for > > filesystems that support quotas. > > I don't think that's quite true. Quota is not modular, so we can > lazily free quota objects even when quota is turned off. All we need > to ensure is that code checks whether quota is enabled, not for the > existence of quota objects attached to the inode. > > Hence quota-off simply turns off all the quota operations in memory, > and normal inode eviction cleans up the stale quota objects > naturally. Ho, hum, possibly yes. I need to think a bit more about this. > My main question is why the quota-on add_dquot_ref() pass is > required. AFAICT all of the filesystem operations that will modify > quota call dquot_initialize() directly to attach the required dquots > to the inode before the operation is started. If that's true, then > why does quota-on need to do this for all the inodes that are > already in cache? This is again for handling quotaon on an already mounted filesystem. We initialize quotas for the inode when opening a file, so if some files are already open when we do quotaon, we want to attach quota structures to these inodes. I think this was kind of important to limit the mismatch between real usage and accounted usage when old-style quotas were used e.g. for the root filesystem, but to be fair this code was there when I became quota maintainer in 1999 and I never dared to remove it :)  > > Now for the sake of > > full context I'll also say that enabling / disabling quotas on a mounted > > filesystem is a legacy feature because it is quite easy that quota > > accounting goes wrong with it. 
So ext4 and f2fs support for quite a few > > years a mode where quota tracking is enabled on mount and disabled on > > unmount (if appropriate fs feature is enabled) and you can only enable / > > disable enforcement of quota limits during runtime. > > Sure, this is how XFS works, too. But I think this behaviour is > largely irrelevant because there are still filesystems out there > that do stuff the old way... > > > So I could see us > > deprecating this functionality altogether although jfs never adapted to > > this new way we do quotas so we'd have to deal with that somehow. But one > > way or another it would take a significant amount of time before we can > > completely remove this so it is out of question for this series. > > I'm not sure that matters, though it adds to the reasons why we > should be removing old, unmaintained filesystems from the tree > and old, outdated formats from maintained filesystems.... True. Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 34+ messages in thread
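To make the lazy quota-off idea discussed above concrete - quota-off only flips an "enabled" flag and normal inode eviction frees the stale per-inode quota objects, so no s_inodes walk is needed - here is a toy model. quota_enabled, charge_quota() and evict_inode() are made-up names for illustration, not the kernel's dquot API:

    /*
     * Hypothetical model, not kernel code: quota-off stops accounting in
     * memory; stale quota objects attached to inodes are freed lazily
     * when the inodes are evicted.
     */
    #include <stdbool.h>
    #include <stdio.h>
    #include <stdlib.h>

    struct dquot {
            long usage;             /* blocks charged to this quota object */
    };

    struct toy_inode {
            struct dquot *i_dquot;  /* attached on first quota-tracked operation */
    };

    static bool quota_enabled = true;

    /* Called by operations that would charge quota. */
    static void charge_quota(struct toy_inode *inode, long nblocks)
    {
            if (!quota_enabled)     /* quota is off: skip accounting entirely */
                    return;
            if (!inode->i_dquot) {
                    inode->i_dquot = calloc(1, sizeof(*inode->i_dquot));
                    if (!inode->i_dquot)
                            return; /* toy model: ignore allocation failure */
            }
            inode->i_dquot->usage += nblocks;
    }

    /* quota-off: no inode iteration, just stop the accounting. */
    static void quota_off(void)
    {
            quota_enabled = false;
    }

    /* Inode eviction frees any stale quota object naturally. */
    static void evict_inode(struct toy_inode *inode)
    {
            free(inode->i_dquot);
            inode->i_dquot = NULL;
    }

    int main(void)
    {
            struct toy_inode inode = { .i_dquot = NULL };

            charge_quota(&inode, 8);
            quota_off();                    /* flag flip only */
            charge_quota(&inode, 8);        /* ignored: quota is off */
            printf("accounted blocks: %ld\n",
                   inode.i_dquot ? inode.i_dquot->usage : 0L);
            evict_inode(&inode);            /* lazy cleanup of the stale object */
            return 0;
    }

This sidesteps the quotaoff iteration, but the quota-on side Jan describes above (attaching quota structures to files that were already open before quotaon) would still need some way to find those inodes.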
Thread overview: 34+ messages
[not found] <20241002014017.3801899-1-david@fromorbit.com>
[not found] ` <20241002014017.3801899-5-david@fromorbit.com>
2024-10-03 7:23 ` lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes() Christoph Hellwig
2024-10-03 7:38 ` Christoph Hellwig
2024-10-03 11:57 ` Jan Kara
2024-10-03 12:11 ` Christoph Hellwig
2024-10-03 12:26 ` Jan Kara
2024-10-03 12:39 ` Christoph Hellwig
2024-10-03 12:56 ` Jan Kara
2024-10-03 13:04 ` Christoph Hellwig
2024-10-03 13:59 ` Dave Chinner
2024-10-03 16:17 ` Jan Kara
2024-10-04 0:46 ` Dave Chinner
2024-10-04 7:21 ` Christian Brauner
2024-10-04 12:14 ` Christoph Hellwig
2024-10-04 13:49 ` Jan Kara
2024-10-04 18:15 ` Paul Moore
2024-10-04 22:57 ` Dave Chinner
2024-10-05 15:21 ` Mickaël Salaün
2024-10-05 16:03 ` Mickaël Salaün
2024-10-05 16:03 ` Paul Moore
2024-10-07 20:37 ` Linus Torvalds
2024-10-07 23:33 ` Dave Chinner
2024-10-08 0:28 ` Linus Torvalds
2024-10-08 0:54 ` Linus Torvalds
2024-10-09 9:49 ` Jan Kara
2024-10-08 12:59 ` Mickaël Salaün
2024-10-09 0:21 ` Dave Chinner
2024-10-09 9:23 ` Mickaël Salaün
2024-10-08 8:57 ` Amir Goldstein
2024-10-08 11:23 ` Jan Kara
2024-10-08 12:16 ` Christian Brauner
2024-10-09 0:03 ` Dave Chinner
2024-10-08 23:44 ` Dave Chinner
2024-10-09 6:10 ` Amir Goldstein
2024-10-09 14:18 ` Jan Kara