* [PATCH v4 1/3] ext4: don't increase iversion counter for ea_inodes @ 2022-08-24 16:03 Lukas Czerner 2022-08-24 16:03 ` [PATCH v4 2/3] fs: record I_DIRTY_TIME even if inode already has I_DIRTY_INODE Lukas Czerner ` (2 more replies) 0 siblings, 3 replies; 13+ messages in thread From: Lukas Czerner @ 2022-08-24 16:03 UTC (permalink / raw) To: linux-ext4 Cc: tytso, jlayton, jack, linux-fsdevel, ebiggers, david, Christian Brauner ea_inodes are using i_version for storing part of the reference count so we really need to leave it alone. The problem can be reproduced by xfstest ext4/026 when iversion is enabled. Fix it by not calling inode_inc_iversion() for EXT4_EA_INODE_FL inodes in ext4_mark_iloc_dirty(). Signed-off-by: Lukas Czerner <lczerner@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> --- v2, v3, v4: no change fs/ext4/inode.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 601214453c3a..2a220be34caa 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -5731,7 +5731,12 @@ int ext4_mark_iloc_dirty(handle_t *handle, } ext4_fc_track_inode(handle, inode); - if (IS_I_VERSION(inode)) + /* + * ea_inodes are using i_version for storing reference count, don't + * mess with it + */ + if (IS_I_VERSION(inode) && + !(EXT4_I(inode)->i_flags & EXT4_EA_INODE_FL)) inode_inc_iversion(inode); /* the do_update_inode consumes one bh->b_count */ -- 2.37.1 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v4 2/3] fs: record I_DIRTY_TIME even if inode already has I_DIRTY_INODE 2022-08-24 16:03 [PATCH v4 1/3] ext4: don't increase iversion counter for ea_inodes Lukas Czerner @ 2022-08-24 16:03 ` Lukas Czerner 2022-08-24 17:31 ` Jan Kara 2022-08-25 10:06 ` [PATCH v5] " Lukas Czerner 2022-08-24 16:03 ` [PATCH v4 3/3] ext4: unconditionally enable the i_version counter Lukas Czerner 2022-09-29 14:58 ` [PATCH v4 1/3] ext4: don't increase iversion counter for ea_inodes Theodore Ts'o 2 siblings, 2 replies; 13+ messages in thread From: Lukas Czerner @ 2022-08-24 16:03 UTC (permalink / raw) To: linux-ext4 Cc: tytso, jlayton, jack, linux-fsdevel, ebiggers, david, Christoph Hellwig Currently the I_DIRTY_TIME will never get set if the inode already has I_DIRTY_INODE with assumption that it supersedes I_DIRTY_TIME. That's true, however ext4 will only update the on-disk inode in ->dirty_inode(), not on actual writeback. As a result if the inode already has I_DIRTY_INODE state by the time we get to __mark_inode_dirty() only with I_DIRTY_TIME, the time was already filled into on-disk inode and will not get updated until the next I_DIRTY_INODE update, which might never come if we crash or get a power failure. The problem can be reproduced on ext4 by running xfstest generic/622 with -o iversion mount option. Fix it by allowing I_DIRTY_TIME to be set even if the inode already has I_DIRTY_INODE. Also make sure that the case is properly handled in writeback_single_inode() as well. Additionally changes in xfs_fs_dirty_inode() was made to accommodate for I_DIRTY_TIME in flag. Thanks Jan Kara for suggestions on how to make this work properly. Cc: Dave Chinner <david@fromorbit.com> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lukas Czerner <lczerner@redhat.com> Suggested-by: Jan Kara <jack@suse.cz> --- v2: Reworked according to suggestions from Jan v3: Update documentation, add comments, change flag to flags in xfs_fs_dirty_inode() v4: Update documentation, simplify condition in xfs_fs_dirty_inode() Documentation/filesystems/vfs.rst | 3 +++ fs/fs-writeback.c | 34 ++++++++++++++++++++----------- fs/xfs/xfs_super.c | 10 +++++++-- include/linux/fs.h | 9 ++++---- 4 files changed, 38 insertions(+), 18 deletions(-) diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst index 6cd6953e175b..b2ef2449aed9 100644 --- a/Documentation/filesystems/vfs.rst +++ b/Documentation/filesystems/vfs.rst @@ -274,6 +274,9 @@ or bottom half). This is specifically for the inode itself being marked dirty, not its data. If the update needs to be persisted by fdatasync(), then I_DIRTY_DATASYNC will be set in the flags argument. + I_DIRTY_TIME will be set in the flags in case lazytime is enabled + and struct inode has times updated since the last ->dirty_inode + call. ``write_inode`` this method is called when the VFS needs to write an inode to diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c index 05221366a16d..638dbf143727 100644 --- a/fs/fs-writeback.c +++ b/fs/fs-writeback.c @@ -1718,9 +1718,14 @@ static int writeback_single_inode(struct inode *inode, */ if (!(inode->i_state & I_DIRTY_ALL)) inode_cgwb_move_to_attached(inode, wb); - else if (!(inode->i_state & I_SYNC_QUEUED) && - (inode->i_state & I_DIRTY)) - redirty_tail_locked(inode, wb); + else if (!(inode->i_state & I_SYNC_QUEUED)) { + if ((inode->i_state & I_DIRTY)) + redirty_tail_locked(inode, wb); + else if (inode->i_state & I_DIRTY_TIME) { + inode->dirtied_when = jiffies; + inode_io_list_move_locked(inode, wb, &wb->b_dirty_time); + } + } spin_unlock(&wb->list_lock); inode_sync_complete(inode); @@ -2369,6 +2374,17 @@ void __mark_inode_dirty(struct inode *inode, int flags) trace_writeback_mark_inode_dirty(inode, flags); if (flags & I_DIRTY_INODE) { + + /* Inode timestamp update will piggback on this dirtying */ + if (inode->i_state & I_DIRTY_TIME) { + spin_lock(&inode->i_lock); + if (inode->i_state & I_DIRTY_TIME) { + inode->i_state &= ~I_DIRTY_TIME; + flags |= I_DIRTY_TIME; + } + spin_unlock(&inode->i_lock); + } + /* * Notify the filesystem about the inode being dirtied, so that * (if needed) it can update on-disk fields and journal the @@ -2378,7 +2394,8 @@ void __mark_inode_dirty(struct inode *inode, int flags) */ trace_writeback_dirty_inode_start(inode, flags); if (sb->s_op->dirty_inode) - sb->s_op->dirty_inode(inode, flags & I_DIRTY_INODE); + sb->s_op->dirty_inode(inode, + flags & (I_DIRTY_INODE | I_DIRTY_TIME)); trace_writeback_dirty_inode(inode, flags); /* I_DIRTY_INODE supersedes I_DIRTY_TIME. */ @@ -2399,21 +2416,15 @@ void __mark_inode_dirty(struct inode *inode, int flags) */ smp_mb(); - if (((inode->i_state & flags) == flags) || - (dirtytime && (inode->i_state & I_DIRTY_INODE))) + if ((inode->i_state & flags) == flags) return; spin_lock(&inode->i_lock); - if (dirtytime && (inode->i_state & I_DIRTY_INODE)) - goto out_unlock_inode; if ((inode->i_state & flags) != flags) { const int was_dirty = inode->i_state & I_DIRTY; inode_attach_wb(inode, NULL); - /* I_DIRTY_INODE supersedes I_DIRTY_TIME. */ - if (flags & I_DIRTY_INODE) - inode->i_state &= ~I_DIRTY_TIME; inode->i_state |= flags; /* @@ -2486,7 +2497,6 @@ void __mark_inode_dirty(struct inode *inode, int flags) out_unlock: if (wb) spin_unlock(&wb->list_lock); -out_unlock_inode: spin_unlock(&inode->i_lock); } EXPORT_SYMBOL(__mark_inode_dirty); diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c index 9ac59814bbb6..f029c6702dda 100644 --- a/fs/xfs/xfs_super.c +++ b/fs/xfs/xfs_super.c @@ -653,7 +653,7 @@ xfs_fs_destroy_inode( static void xfs_fs_dirty_inode( struct inode *inode, - int flag) + int flags) { struct xfs_inode *ip = XFS_I(inode); struct xfs_mount *mp = ip->i_mount; @@ -661,7 +661,13 @@ xfs_fs_dirty_inode( if (!(inode->i_sb->s_flags & SB_LAZYTIME)) return; - if (flag != I_DIRTY_SYNC || !(inode->i_state & I_DIRTY_TIME)) + + /* + * Only do the timestamp update if the inode is dirty (I_DIRTY_SYNC) + * and has dirty timestamp (I_DIRTY_TIME). I_DIRTY_TIME can be passed + * in flags possibly together with I_DIRTY_SYNC. + */ + if ((flags & ~I_DIRTY_TIME) != I_DIRTY_SYNC || !(flags & I_DIRTY_TIME)) return; if (xfs_trans_alloc(mp, &M_RES(mp)->tr_fsyncts, 0, 0, 0, &tp)) diff --git a/include/linux/fs.h b/include/linux/fs.h index 9eced4cc286e..56a4b4b02477 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2371,13 +2371,14 @@ static inline void kiocb_clone(struct kiocb *kiocb, struct kiocb *kiocb_src, * don't have to write inode on fdatasync() when only * e.g. the timestamps have changed. * I_DIRTY_PAGES Inode has dirty pages. Inode itself may be clean. - * I_DIRTY_TIME The inode itself only has dirty timestamps, and the + * I_DIRTY_TIME The inode itself has dirty timestamps, and the * lazytime mount option is enabled. We keep track of this * separately from I_DIRTY_SYNC in order to implement * lazytime. This gets cleared if I_DIRTY_INODE - * (I_DIRTY_SYNC and/or I_DIRTY_DATASYNC) gets set. I.e. - * either I_DIRTY_TIME *or* I_DIRTY_INODE can be set in - * i_state, but not both. I_DIRTY_PAGES may still be set. + * (I_DIRTY_SYNC and/or I_DIRTY_DATASYNC) gets set. But + * I_DIRTY_TIME can still be set if I_DIRTY_SYNC is already + * in place because writeback might already be in progress + * and we don't want to lose the time update * I_NEW Serves as both a mutex and completion notification. * New inodes set I_NEW. If two processes both create * the same inode, one of them will release its inode and -- 2.37.1 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v4 2/3] fs: record I_DIRTY_TIME even if inode already has I_DIRTY_INODE 2022-08-24 16:03 ` [PATCH v4 2/3] fs: record I_DIRTY_TIME even if inode already has I_DIRTY_INODE Lukas Czerner @ 2022-08-24 17:31 ` Jan Kara 2022-08-25 10:06 ` [PATCH v5] " Lukas Czerner 1 sibling, 0 replies; 13+ messages in thread From: Jan Kara @ 2022-08-24 17:31 UTC (permalink / raw) To: Lukas Czerner Cc: linux-ext4, tytso, jlayton, jack, linux-fsdevel, ebiggers, david, Christoph Hellwig On Wed 24-08-22 18:03:48, Lukas Czerner wrote: > Currently the I_DIRTY_TIME will never get set if the inode already has > I_DIRTY_INODE with assumption that it supersedes I_DIRTY_TIME. That's > true, however ext4 will only update the on-disk inode in > ->dirty_inode(), not on actual writeback. As a result if the inode > already has I_DIRTY_INODE state by the time we get to > __mark_inode_dirty() only with I_DIRTY_TIME, the time was already filled > into on-disk inode and will not get updated until the next I_DIRTY_INODE > update, which might never come if we crash or get a power failure. > > The problem can be reproduced on ext4 by running xfstest generic/622 > with -o iversion mount option. > > Fix it by allowing I_DIRTY_TIME to be set even if the inode already has > I_DIRTY_INODE. Also make sure that the case is properly handled in > writeback_single_inode() as well. Additionally changes in > xfs_fs_dirty_inode() was made to accommodate for I_DIRTY_TIME in flag. > > Thanks Jan Kara for suggestions on how to make this work properly. > > Cc: Dave Chinner <david@fromorbit.com> > Cc: Christoph Hellwig <hch@infradead.org> > Signed-off-by: Lukas Czerner <lczerner@redhat.com> > Suggested-by: Jan Kara <jack@suse.cz> Looks good to me. Feel free to add: Reviewed-by: Jan Kara <jack@suse.cz> Just two nits below: > @@ -2369,6 +2374,17 @@ void __mark_inode_dirty(struct inode *inode, int flags) > trace_writeback_mark_inode_dirty(inode, flags); > > if (flags & I_DIRTY_INODE) { > + Pointless empty line here. > + /* Inode timestamp update will piggback on this dirtying */ Maybe expand this comment to: /* * Inode timestamp update will piggback on this dirtying. * We tell ->dirty_inode callback that timestamps need to * be updated by setting I_DIRTY_TIME in flags. */ > + if (inode->i_state & I_DIRTY_TIME) { > + spin_lock(&inode->i_lock); > + if (inode->i_state & I_DIRTY_TIME) { > + inode->i_state &= ~I_DIRTY_TIME; > + flags |= I_DIRTY_TIME; > + } > + spin_unlock(&inode->i_lock); > + } > + Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v5] fs: record I_DIRTY_TIME even if inode already has I_DIRTY_INODE 2022-08-24 16:03 ` [PATCH v4 2/3] fs: record I_DIRTY_TIME even if inode already has I_DIRTY_INODE Lukas Czerner 2022-08-24 17:31 ` Jan Kara @ 2022-08-25 10:06 ` Lukas Czerner 2022-09-29 14:58 ` Theodore Ts'o 1 sibling, 1 reply; 13+ messages in thread From: Lukas Czerner @ 2022-08-25 10:06 UTC (permalink / raw) To: linux-ext4 Cc: tytso, jlayton, jack, linux-fsdevel, ebiggers, david, Christoph Hellwig Currently the I_DIRTY_TIME will never get set if the inode already has I_DIRTY_INODE with assumption that it supersedes I_DIRTY_TIME. That's true, however ext4 will only update the on-disk inode in ->dirty_inode(), not on actual writeback. As a result if the inode already has I_DIRTY_INODE state by the time we get to __mark_inode_dirty() only with I_DIRTY_TIME, the time was already filled into on-disk inode and will not get updated until the next I_DIRTY_INODE update, which might never come if we crash or get a power failure. The problem can be reproduced on ext4 by running xfstest generic/622 with -o iversion mount option. Fix it by allowing I_DIRTY_TIME to be set even if the inode already has I_DIRTY_INODE. Also make sure that the case is properly handled in writeback_single_inode() as well. Additionally changes in xfs_fs_dirty_inode() was made to accommodate for I_DIRTY_TIME in flag. Thanks Jan Kara for suggestions on how to make this work properly. Cc: Dave Chinner <david@fromorbit.com> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lukas Czerner <lczerner@redhat.com> Suggested-by: Jan Kara <jack@suse.cz> Reviewed-by: Jan Kara <jack@suse.cz> --- v2: Reworked according to suggestions from Jan v3: Update documentation, add comments, change flag to flags in xfs_fs_dirty_inode() v4: Update documentation, simplify condition in xfs_fs_dirty_inode() v5: Update comment for condition in __mark_inode_dirty() Documentation/filesystems/vfs.rst | 3 +++ fs/fs-writeback.c | 37 +++++++++++++++++++++---------- fs/xfs/xfs_super.c | 10 +++++++-- include/linux/fs.h | 9 ++++---- 4 files changed, 41 insertions(+), 18 deletions(-) diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst index 6cd6953e175b..b2ef2449aed9 100644 --- a/Documentation/filesystems/vfs.rst +++ b/Documentation/filesystems/vfs.rst @@ -274,6 +274,9 @@ or bottom half). This is specifically for the inode itself being marked dirty, not its data. If the update needs to be persisted by fdatasync(), then I_DIRTY_DATASYNC will be set in the flags argument. + I_DIRTY_TIME will be set in the flags in case lazytime is enabled + and struct inode has times updated since the last ->dirty_inode + call. ``write_inode`` this method is called when the VFS needs to write an inode to diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c index 05221366a16d..45860591d51f 100644 --- a/fs/fs-writeback.c +++ b/fs/fs-writeback.c @@ -1718,9 +1718,14 @@ static int writeback_single_inode(struct inode *inode, */ if (!(inode->i_state & I_DIRTY_ALL)) inode_cgwb_move_to_attached(inode, wb); - else if (!(inode->i_state & I_SYNC_QUEUED) && - (inode->i_state & I_DIRTY)) - redirty_tail_locked(inode, wb); + else if (!(inode->i_state & I_SYNC_QUEUED)) { + if ((inode->i_state & I_DIRTY)) + redirty_tail_locked(inode, wb); + else if (inode->i_state & I_DIRTY_TIME) { + inode->dirtied_when = jiffies; + inode_io_list_move_locked(inode, wb, &wb->b_dirty_time); + } + } spin_unlock(&wb->list_lock); inode_sync_complete(inode); @@ -2369,6 +2374,20 @@ void __mark_inode_dirty(struct inode *inode, int flags) trace_writeback_mark_inode_dirty(inode, flags); if (flags & I_DIRTY_INODE) { + /* + * Inode timestamp update will piggback on this dirtying. + * We tell ->dirty_inode callback that timestamps need to + * be updated by setting I_DIRTY_TIME in flags. + */ + if (inode->i_state & I_DIRTY_TIME) { + spin_lock(&inode->i_lock); + if (inode->i_state & I_DIRTY_TIME) { + inode->i_state &= ~I_DIRTY_TIME; + flags |= I_DIRTY_TIME; + } + spin_unlock(&inode->i_lock); + } + /* * Notify the filesystem about the inode being dirtied, so that * (if needed) it can update on-disk fields and journal the @@ -2378,7 +2397,8 @@ void __mark_inode_dirty(struct inode *inode, int flags) */ trace_writeback_dirty_inode_start(inode, flags); if (sb->s_op->dirty_inode) - sb->s_op->dirty_inode(inode, flags & I_DIRTY_INODE); + sb->s_op->dirty_inode(inode, + flags & (I_DIRTY_INODE | I_DIRTY_TIME)); trace_writeback_dirty_inode(inode, flags); /* I_DIRTY_INODE supersedes I_DIRTY_TIME. */ @@ -2399,21 +2419,15 @@ void __mark_inode_dirty(struct inode *inode, int flags) */ smp_mb(); - if (((inode->i_state & flags) == flags) || - (dirtytime && (inode->i_state & I_DIRTY_INODE))) + if ((inode->i_state & flags) == flags) return; spin_lock(&inode->i_lock); - if (dirtytime && (inode->i_state & I_DIRTY_INODE)) - goto out_unlock_inode; if ((inode->i_state & flags) != flags) { const int was_dirty = inode->i_state & I_DIRTY; inode_attach_wb(inode, NULL); - /* I_DIRTY_INODE supersedes I_DIRTY_TIME. */ - if (flags & I_DIRTY_INODE) - inode->i_state &= ~I_DIRTY_TIME; inode->i_state |= flags; /* @@ -2486,7 +2500,6 @@ void __mark_inode_dirty(struct inode *inode, int flags) out_unlock: if (wb) spin_unlock(&wb->list_lock); -out_unlock_inode: spin_unlock(&inode->i_lock); } EXPORT_SYMBOL(__mark_inode_dirty); diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c index 9ac59814bbb6..f029c6702dda 100644 --- a/fs/xfs/xfs_super.c +++ b/fs/xfs/xfs_super.c @@ -653,7 +653,7 @@ xfs_fs_destroy_inode( static void xfs_fs_dirty_inode( struct inode *inode, - int flag) + int flags) { struct xfs_inode *ip = XFS_I(inode); struct xfs_mount *mp = ip->i_mount; @@ -661,7 +661,13 @@ xfs_fs_dirty_inode( if (!(inode->i_sb->s_flags & SB_LAZYTIME)) return; - if (flag != I_DIRTY_SYNC || !(inode->i_state & I_DIRTY_TIME)) + + /* + * Only do the timestamp update if the inode is dirty (I_DIRTY_SYNC) + * and has dirty timestamp (I_DIRTY_TIME). I_DIRTY_TIME can be passed + * in flags possibly together with I_DIRTY_SYNC. + */ + if ((flags & ~I_DIRTY_TIME) != I_DIRTY_SYNC || !(flags & I_DIRTY_TIME)) return; if (xfs_trans_alloc(mp, &M_RES(mp)->tr_fsyncts, 0, 0, 0, &tp)) diff --git a/include/linux/fs.h b/include/linux/fs.h index 9eced4cc286e..56a4b4b02477 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2371,13 +2371,14 @@ static inline void kiocb_clone(struct kiocb *kiocb, struct kiocb *kiocb_src, * don't have to write inode on fdatasync() when only * e.g. the timestamps have changed. * I_DIRTY_PAGES Inode has dirty pages. Inode itself may be clean. - * I_DIRTY_TIME The inode itself only has dirty timestamps, and the + * I_DIRTY_TIME The inode itself has dirty timestamps, and the * lazytime mount option is enabled. We keep track of this * separately from I_DIRTY_SYNC in order to implement * lazytime. This gets cleared if I_DIRTY_INODE - * (I_DIRTY_SYNC and/or I_DIRTY_DATASYNC) gets set. I.e. - * either I_DIRTY_TIME *or* I_DIRTY_INODE can be set in - * i_state, but not both. I_DIRTY_PAGES may still be set. + * (I_DIRTY_SYNC and/or I_DIRTY_DATASYNC) gets set. But + * I_DIRTY_TIME can still be set if I_DIRTY_SYNC is already + * in place because writeback might already be in progress + * and we don't want to lose the time update * I_NEW Serves as both a mutex and completion notification. * New inodes set I_NEW. If two processes both create * the same inode, one of them will release its inode and -- 2.37.1 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v5] fs: record I_DIRTY_TIME even if inode already has I_DIRTY_INODE 2022-08-25 10:06 ` [PATCH v5] " Lukas Czerner @ 2022-09-29 14:58 ` Theodore Ts'o 0 siblings, 0 replies; 13+ messages in thread From: Theodore Ts'o @ 2022-09-29 14:58 UTC (permalink / raw) To: linux-ext4, lczerner Cc: Theodore Ts'o, jlayton, ebiggers, jack, david, linux-fsdevel, hch On Thu, 25 Aug 2022 12:06:57 +0200, Lukas Czerner wrote: > Currently the I_DIRTY_TIME will never get set if the inode already has > I_DIRTY_INODE with assumption that it supersedes I_DIRTY_TIME. That's > true, however ext4 will only update the on-disk inode in > ->dirty_inode(), not on actual writeback. As a result if the inode > already has I_DIRTY_INODE state by the time we get to > __mark_inode_dirty() only with I_DIRTY_TIME, the time was already filled > into on-disk inode and will not get updated until the next I_DIRTY_INODE > update, which might never come if we crash or get a power failure. > > [...] Applied, thanks! [1/1] fs: record I_DIRTY_TIME even if inode already has I_DIRTY_INODE commit: 625e1e67b66245b93ccae868cd4a950d257de003 Best regards, -- Theodore Ts'o <tytso@mit.edu> ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v4 3/3] ext4: unconditionally enable the i_version counter 2022-08-24 16:03 [PATCH v4 1/3] ext4: don't increase iversion counter for ea_inodes Lukas Czerner 2022-08-24 16:03 ` [PATCH v4 2/3] fs: record I_DIRTY_TIME even if inode already has I_DIRTY_INODE Lukas Czerner @ 2022-08-24 16:03 ` Lukas Czerner 2022-08-26 16:11 ` Jeff Layton 2022-09-29 14:58 ` [PATCH v4 1/3] ext4: don't increase iversion counter for ea_inodes Theodore Ts'o 2 siblings, 1 reply; 13+ messages in thread From: Lukas Czerner @ 2022-08-24 16:03 UTC (permalink / raw) To: linux-ext4 Cc: tytso, jlayton, jack, linux-fsdevel, ebiggers, david, Benjamin Coddington, Christoph Hellwig, Darrick J . Wong, Christian Brauner From: Jeff Layton <jlayton@kernel.org> The original i_version implementation was pretty expensive, requiring a log flush on every change. Because of this, it was gated behind a mount option (implemented via the MS_I_VERSION mountoption flag). Commit ae5e165d855d (fs: new API for handling inode->i_version) made the i_version flag much less expensive, so there is no longer a performance penalty from enabling it. xfs and btrfs already enable it unconditionally when the on-disk format can support it. Have ext4 ignore the SB_I_VERSION flag, and just enable it unconditionally. While we're in here, remove the handling of Opt_i_version as well, since we're almost to 5.20 anyway. Ideally, we'd couple this change with a way to disable the i_version counter (just in case), but the way the iversion mount option was implemented makes that difficult to do. We'd need to add a new mount option altogether or do something with tune2fs. That's probably best left to later patches if it turns out to be needed. [ Removed leftover bits of i_version from ext4_apply_options() since it now can't ever be set in ctx->mask_s_flags -- lczerner ] Cc: Dave Chinner <david@fromorbit.com> Cc: Benjamin Coddington <bcodding@redhat.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Lukas Czerner <lczerner@redhat.com> Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> --- v3: Removed leftover bits of i_version from ext4_apply_options v4: no change fs/ext4/inode.c | 5 ++--- fs/ext4/super.c | 21 ++++----------------- 2 files changed, 6 insertions(+), 20 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 2a220be34caa..c77d40f05763 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -5425,7 +5425,7 @@ int ext4_setattr(struct user_namespace *mnt_userns, struct dentry *dentry, return -EINVAL; } - if (IS_I_VERSION(inode) && attr->ia_size != inode->i_size) + if (attr->ia_size != inode->i_size) inode_inc_iversion(inode); if (shrink) { @@ -5735,8 +5735,7 @@ int ext4_mark_iloc_dirty(handle_t *handle, * ea_inodes are using i_version for storing reference count, don't * mess with it */ - if (IS_I_VERSION(inode) && - !(EXT4_I(inode)->i_flags & EXT4_EA_INODE_FL)) + if (!(EXT4_I(inode)->i_flags & EXT4_EA_INODE_FL)) inode_inc_iversion(inode); /* the do_update_inode consumes one bh->b_count */ diff --git a/fs/ext4/super.c b/fs/ext4/super.c index 9a66abcca1a8..1c953f6d400e 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -1585,7 +1585,7 @@ enum { Opt_inlinecrypt, Opt_usrjquota, Opt_grpjquota, Opt_quota, Opt_noquota, Opt_barrier, Opt_nobarrier, Opt_err, - Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_i_version, + Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_dax, Opt_dax_always, Opt_dax_inode, Opt_dax_never, Opt_stripe, Opt_delalloc, Opt_nodelalloc, Opt_warn_on_error, Opt_nowarn_on_error, Opt_mblk_io_submit, Opt_debug_want_extra_isize, @@ -1694,7 +1694,6 @@ static const struct fs_parameter_spec ext4_param_specs[] = { fsparam_flag ("barrier", Opt_barrier), fsparam_u32 ("barrier", Opt_barrier), fsparam_flag ("nobarrier", Opt_nobarrier), - fsparam_flag ("i_version", Opt_i_version), fsparam_flag ("dax", Opt_dax), fsparam_enum ("dax", Opt_dax_type, ext4_param_dax), fsparam_u32 ("stripe", Opt_stripe), @@ -2140,11 +2139,6 @@ static int ext4_parse_param(struct fs_context *fc, struct fs_parameter *param) case Opt_abort: ctx_set_mount_flag(ctx, EXT4_MF_FS_ABORTED); return 0; - case Opt_i_version: - ext4_msg(NULL, KERN_WARNING, deprecated_msg, param->key, "5.20"); - ext4_msg(NULL, KERN_WARNING, "Use iversion instead\n"); - ctx_set_flags(ctx, SB_I_VERSION); - return 0; case Opt_inlinecrypt: #ifdef CONFIG_FS_ENCRYPTION_INLINE_CRYPT ctx_set_flags(ctx, SB_INLINECRYPT); @@ -2814,14 +2808,6 @@ static void ext4_apply_options(struct fs_context *fc, struct super_block *sb) sb->s_flags &= ~ctx->mask_s_flags; sb->s_flags |= ctx->vals_s_flags; - /* - * i_version differs from common mount option iversion so we have - * to let vfs know that it was set, otherwise it would get cleared - * on remount - */ - if (ctx->mask_s_flags & SB_I_VERSION) - fc->sb_flags |= SB_I_VERSION; - #define APPLY(X) ({ if (ctx->spec & EXT4_SPEC_##X) sbi->X = ctx->X; }) APPLY(s_commit_interval); APPLY(s_stripe); @@ -2970,8 +2956,6 @@ static int _ext4_show_options(struct seq_file *seq, struct super_block *sb, SEQ_OPTS_PRINT("min_batch_time=%u", sbi->s_min_batch_time); if (nodefs || sbi->s_max_batch_time != EXT4_DEF_MAX_BATCH_TIME) SEQ_OPTS_PRINT("max_batch_time=%u", sbi->s_max_batch_time); - if (sb->s_flags & SB_I_VERSION) - SEQ_OPTS_PUTS("i_version"); if (nodefs || sbi->s_stripe) SEQ_OPTS_PRINT("stripe=%lu", sbi->s_stripe); if (nodefs || EXT4_MOUNT_DATA_FLAGS & @@ -4640,6 +4624,9 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb) sb->s_flags = (sb->s_flags & ~SB_POSIXACL) | (test_opt(sb, POSIX_ACL) ? SB_POSIXACL : 0); + /* i_version is always enabled now */ + sb->s_flags |= SB_I_VERSION; + if (le32_to_cpu(es->s_rev_level) == EXT4_GOOD_OLD_REV && (ext4_has_compat_features(sb) || ext4_has_ro_compat_features(sb) || -- 2.37.1 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v4 3/3] ext4: unconditionally enable the i_version counter 2022-08-24 16:03 ` [PATCH v4 3/3] ext4: unconditionally enable the i_version counter Lukas Czerner @ 2022-08-26 16:11 ` Jeff Layton 2022-08-29 8:17 ` Lukas Czerner 0 siblings, 1 reply; 13+ messages in thread From: Jeff Layton @ 2022-08-26 16:11 UTC (permalink / raw) To: Lukas Czerner, linux-ext4 Cc: tytso, jack, linux-fsdevel, ebiggers, david, Benjamin Coddington, Christoph Hellwig, Darrick J . Wong, Christian Brauner On Wed, 2022-08-24 at 18:03 +0200, Lukas Czerner wrote: > From: Jeff Layton <jlayton@kernel.org> > > The original i_version implementation was pretty expensive, requiring a > log flush on every change. Because of this, it was gated behind a mount > option (implemented via the MS_I_VERSION mountoption flag). > > Commit ae5e165d855d (fs: new API for handling inode->i_version) made the > i_version flag much less expensive, so there is no longer a performance > penalty from enabling it. xfs and btrfs already enable it > unconditionally when the on-disk format can support it. > > Have ext4 ignore the SB_I_VERSION flag, and just enable it > unconditionally. While we're in here, remove the handling of > Opt_i_version as well, since we're almost to 5.20 anyway. > > Ideally, we'd couple this change with a way to disable the i_version > counter (just in case), but the way the iversion mount option was > implemented makes that difficult to do. We'd need to add a new mount > option altogether or do something with tune2fs. That's probably best > left to later patches if it turns out to be needed. > > [ Removed leftover bits of i_version from ext4_apply_options() since it > now can't ever be set in ctx->mask_s_flags -- lczerner ] > > Cc: Dave Chinner <david@fromorbit.com> > Cc: Benjamin Coddington <bcodding@redhat.com> > Cc: Christoph Hellwig <hch@infradead.org> > Cc: Darrick J. Wong <djwong@kernel.org> > Signed-off-by: Jeff Layton <jlayton@kernel.org> > Signed-off-by: Lukas Czerner <lczerner@redhat.com> > Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> > Reviewed-by: Jan Kara <jack@suse.cz> > --- > v3: Removed leftover bits of i_version from ext4_apply_options > v4: no change > > fs/ext4/inode.c | 5 ++--- > fs/ext4/super.c | 21 ++++----------------- > 2 files changed, 6 insertions(+), 20 deletions(-) > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c > index 2a220be34caa..c77d40f05763 100644 > --- a/fs/ext4/inode.c > +++ b/fs/ext4/inode.c > @@ -5425,7 +5425,7 @@ int ext4_setattr(struct user_namespace *mnt_userns, struct dentry *dentry, > return -EINVAL; > } > > - if (IS_I_VERSION(inode) && attr->ia_size != inode->i_size) > + if (attr->ia_size != inode->i_size) > inode_inc_iversion(inode); > > if (shrink) { > @@ -5735,8 +5735,7 @@ int ext4_mark_iloc_dirty(handle_t *handle, > * ea_inodes are using i_version for storing reference count, don't > * mess with it > */ > - if (IS_I_VERSION(inode) && > - !(EXT4_I(inode)->i_flags & EXT4_EA_INODE_FL)) > + if (!(EXT4_I(inode)->i_flags & EXT4_EA_INODE_FL)) > inode_inc_iversion(inode); > > /* the do_update_inode consumes one bh->b_count */ > diff --git a/fs/ext4/super.c b/fs/ext4/super.c > index 9a66abcca1a8..1c953f6d400e 100644 > --- a/fs/ext4/super.c > +++ b/fs/ext4/super.c > @@ -1585,7 +1585,7 @@ enum { > Opt_inlinecrypt, > Opt_usrjquota, Opt_grpjquota, Opt_quota, > Opt_noquota, Opt_barrier, Opt_nobarrier, Opt_err, > - Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_i_version, > + Opt_usrquota, Opt_grpquota, Opt_prjquota, > Opt_dax, Opt_dax_always, Opt_dax_inode, Opt_dax_never, > Opt_stripe, Opt_delalloc, Opt_nodelalloc, Opt_warn_on_error, > Opt_nowarn_on_error, Opt_mblk_io_submit, Opt_debug_want_extra_isize, > @@ -1694,7 +1694,6 @@ static const struct fs_parameter_spec ext4_param_specs[] = { > fsparam_flag ("barrier", Opt_barrier), > fsparam_u32 ("barrier", Opt_barrier), > fsparam_flag ("nobarrier", Opt_nobarrier), > - fsparam_flag ("i_version", Opt_i_version), > fsparam_flag ("dax", Opt_dax), > fsparam_enum ("dax", Opt_dax_type, ext4_param_dax), > fsparam_u32 ("stripe", Opt_stripe), > @@ -2140,11 +2139,6 @@ static int ext4_parse_param(struct fs_context *fc, struct fs_parameter *param) > case Opt_abort: > ctx_set_mount_flag(ctx, EXT4_MF_FS_ABORTED); > return 0; > - case Opt_i_version: > - ext4_msg(NULL, KERN_WARNING, deprecated_msg, param->key, "5.20"); > - ext4_msg(NULL, KERN_WARNING, "Use iversion instead\n"); > - ctx_set_flags(ctx, SB_I_VERSION); > - return 0; > case Opt_inlinecrypt: > #ifdef CONFIG_FS_ENCRYPTION_INLINE_CRYPT > ctx_set_flags(ctx, SB_INLINECRYPT); > @@ -2814,14 +2808,6 @@ static void ext4_apply_options(struct fs_context *fc, struct super_block *sb) > sb->s_flags &= ~ctx->mask_s_flags; > sb->s_flags |= ctx->vals_s_flags; > > - /* > - * i_version differs from common mount option iversion so we have > - * to let vfs know that it was set, otherwise it would get cleared > - * on remount > - */ > - if (ctx->mask_s_flags & SB_I_VERSION) > - fc->sb_flags |= SB_I_VERSION; > - > #define APPLY(X) ({ if (ctx->spec & EXT4_SPEC_##X) sbi->X = ctx->X; }) > APPLY(s_commit_interval); > APPLY(s_stripe); > @@ -2970,8 +2956,6 @@ static int _ext4_show_options(struct seq_file *seq, struct super_block *sb, > SEQ_OPTS_PRINT("min_batch_time=%u", sbi->s_min_batch_time); > if (nodefs || sbi->s_max_batch_time != EXT4_DEF_MAX_BATCH_TIME) > SEQ_OPTS_PRINT("max_batch_time=%u", sbi->s_max_batch_time); > - if (sb->s_flags & SB_I_VERSION) > - SEQ_OPTS_PUTS("i_version"); > if (nodefs || sbi->s_stripe) > SEQ_OPTS_PRINT("stripe=%lu", sbi->s_stripe); > if (nodefs || EXT4_MOUNT_DATA_FLAGS & > @@ -4640,6 +4624,9 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb) > sb->s_flags = (sb->s_flags & ~SB_POSIXACL) | > (test_opt(sb, POSIX_ACL) ? SB_POSIXACL : 0); > > + /* i_version is always enabled now */ > + sb->s_flags |= SB_I_VERSION; > + > if (le32_to_cpu(es->s_rev_level) == EXT4_GOOD_OLD_REV && > (ext4_has_compat_features(sb) || > ext4_has_ro_compat_features(sb) || Hi Lukas, I know I had originally asked you to shepherd this patch into mainline, but I think it may be better to wait on it for now. Since I asked that, we've since found out that ext4 is bumping the i_version counter on atime updates. It'd be best to get that fixed before we turn this on unconditionally, since it could cause a performance regression in some cases. I'll plan to pick this back up for my latest i_version series if that sounds ok to you. Sorry for the back and forth, and thanks again! Cheers, -- Jeff Layton <jlayton@kernel.org> ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v4 3/3] ext4: unconditionally enable the i_version counter 2022-08-26 16:11 ` Jeff Layton @ 2022-08-29 8:17 ` Lukas Czerner 2022-08-29 10:16 ` Jeff Layton 0 siblings, 1 reply; 13+ messages in thread From: Lukas Czerner @ 2022-08-29 8:17 UTC (permalink / raw) To: Jeff Layton Cc: linux-ext4, tytso, jack, linux-fsdevel, ebiggers, david, Benjamin Coddington, Christoph Hellwig, Darrick J . Wong, Christian Brauner On Fri, Aug 26, 2022 at 12:11:23PM -0400, Jeff Layton wrote: > On Wed, 2022-08-24 at 18:03 +0200, Lukas Czerner wrote: > > From: Jeff Layton <jlayton@kernel.org> > > > > The original i_version implementation was pretty expensive, requiring a > > log flush on every change. Because of this, it was gated behind a mount > > option (implemented via the MS_I_VERSION mountoption flag). > > > > Commit ae5e165d855d (fs: new API for handling inode->i_version) made the > > i_version flag much less expensive, so there is no longer a performance > > penalty from enabling it. xfs and btrfs already enable it > > unconditionally when the on-disk format can support it. > > > > Have ext4 ignore the SB_I_VERSION flag, and just enable it > > unconditionally. While we're in here, remove the handling of > > Opt_i_version as well, since we're almost to 5.20 anyway. > > > > Ideally, we'd couple this change with a way to disable the i_version > > counter (just in case), but the way the iversion mount option was > > implemented makes that difficult to do. We'd need to add a new mount > > option altogether or do something with tune2fs. That's probably best > > left to later patches if it turns out to be needed. > > > > [ Removed leftover bits of i_version from ext4_apply_options() since it > > now can't ever be set in ctx->mask_s_flags -- lczerner ] > > > > Cc: Dave Chinner <david@fromorbit.com> > > Cc: Benjamin Coddington <bcodding@redhat.com> > > Cc: Christoph Hellwig <hch@infradead.org> > > Cc: Darrick J. Wong <djwong@kernel.org> > > Signed-off-by: Jeff Layton <jlayton@kernel.org> > > Signed-off-by: Lukas Czerner <lczerner@redhat.com> > > Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> > > Reviewed-by: Jan Kara <jack@suse.cz> > > --- > > v3: Removed leftover bits of i_version from ext4_apply_options > > v4: no change > > > > fs/ext4/inode.c | 5 ++--- > > fs/ext4/super.c | 21 ++++----------------- > > 2 files changed, 6 insertions(+), 20 deletions(-) > > > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c > > index 2a220be34caa..c77d40f05763 100644 > > --- a/fs/ext4/inode.c > > +++ b/fs/ext4/inode.c > > @@ -5425,7 +5425,7 @@ int ext4_setattr(struct user_namespace *mnt_userns, struct dentry *dentry, > > return -EINVAL; > > } > > > > - if (IS_I_VERSION(inode) && attr->ia_size != inode->i_size) > > + if (attr->ia_size != inode->i_size) > > inode_inc_iversion(inode); > > > > if (shrink) { > > @@ -5735,8 +5735,7 @@ int ext4_mark_iloc_dirty(handle_t *handle, > > * ea_inodes are using i_version for storing reference count, don't > > * mess with it > > */ > > - if (IS_I_VERSION(inode) && > > - !(EXT4_I(inode)->i_flags & EXT4_EA_INODE_FL)) > > + if (!(EXT4_I(inode)->i_flags & EXT4_EA_INODE_FL)) > > inode_inc_iversion(inode); > > > > /* the do_update_inode consumes one bh->b_count */ > > diff --git a/fs/ext4/super.c b/fs/ext4/super.c > > index 9a66abcca1a8..1c953f6d400e 100644 > > --- a/fs/ext4/super.c > > +++ b/fs/ext4/super.c > > @@ -1585,7 +1585,7 @@ enum { > > Opt_inlinecrypt, > > Opt_usrjquota, Opt_grpjquota, Opt_quota, > > Opt_noquota, Opt_barrier, Opt_nobarrier, Opt_err, > > - Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_i_version, > > + Opt_usrquota, Opt_grpquota, Opt_prjquota, > > Opt_dax, Opt_dax_always, Opt_dax_inode, Opt_dax_never, > > Opt_stripe, Opt_delalloc, Opt_nodelalloc, Opt_warn_on_error, > > Opt_nowarn_on_error, Opt_mblk_io_submit, Opt_debug_want_extra_isize, > > @@ -1694,7 +1694,6 @@ static const struct fs_parameter_spec ext4_param_specs[] = { > > fsparam_flag ("barrier", Opt_barrier), > > fsparam_u32 ("barrier", Opt_barrier), > > fsparam_flag ("nobarrier", Opt_nobarrier), > > - fsparam_flag ("i_version", Opt_i_version), > > fsparam_flag ("dax", Opt_dax), > > fsparam_enum ("dax", Opt_dax_type, ext4_param_dax), > > fsparam_u32 ("stripe", Opt_stripe), > > @@ -2140,11 +2139,6 @@ static int ext4_parse_param(struct fs_context *fc, struct fs_parameter *param) > > case Opt_abort: > > ctx_set_mount_flag(ctx, EXT4_MF_FS_ABORTED); > > return 0; > > - case Opt_i_version: > > - ext4_msg(NULL, KERN_WARNING, deprecated_msg, param->key, "5.20"); > > - ext4_msg(NULL, KERN_WARNING, "Use iversion instead\n"); > > - ctx_set_flags(ctx, SB_I_VERSION); > > - return 0; > > case Opt_inlinecrypt: > > #ifdef CONFIG_FS_ENCRYPTION_INLINE_CRYPT > > ctx_set_flags(ctx, SB_INLINECRYPT); > > @@ -2814,14 +2808,6 @@ static void ext4_apply_options(struct fs_context *fc, struct super_block *sb) > > sb->s_flags &= ~ctx->mask_s_flags; > > sb->s_flags |= ctx->vals_s_flags; > > > > - /* > > - * i_version differs from common mount option iversion so we have > > - * to let vfs know that it was set, otherwise it would get cleared > > - * on remount > > - */ > > - if (ctx->mask_s_flags & SB_I_VERSION) > > - fc->sb_flags |= SB_I_VERSION; > > - > > #define APPLY(X) ({ if (ctx->spec & EXT4_SPEC_##X) sbi->X = ctx->X; }) > > APPLY(s_commit_interval); > > APPLY(s_stripe); > > @@ -2970,8 +2956,6 @@ static int _ext4_show_options(struct seq_file *seq, struct super_block *sb, > > SEQ_OPTS_PRINT("min_batch_time=%u", sbi->s_min_batch_time); > > if (nodefs || sbi->s_max_batch_time != EXT4_DEF_MAX_BATCH_TIME) > > SEQ_OPTS_PRINT("max_batch_time=%u", sbi->s_max_batch_time); > > - if (sb->s_flags & SB_I_VERSION) > > - SEQ_OPTS_PUTS("i_version"); > > if (nodefs || sbi->s_stripe) > > SEQ_OPTS_PRINT("stripe=%lu", sbi->s_stripe); > > if (nodefs || EXT4_MOUNT_DATA_FLAGS & > > @@ -4640,6 +4624,9 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb) > > sb->s_flags = (sb->s_flags & ~SB_POSIXACL) | > > (test_opt(sb, POSIX_ACL) ? SB_POSIXACL : 0); > > > > + /* i_version is always enabled now */ > > + sb->s_flags |= SB_I_VERSION; > > + > > if (le32_to_cpu(es->s_rev_level) == EXT4_GOOD_OLD_REV && > > (ext4_has_compat_features(sb) || > > ext4_has_ro_compat_features(sb) || > > Hi Lukas, > > I know I had originally asked you to shepherd this patch into mainline, > but I think it may be better to wait on it for now. Since I asked that, > we've since found out that ext4 is bumping the i_version counter on > atime updates. It'd be best to get that fixed before we turn this on > unconditionally, since it could cause a performance regression in some > cases. I'll plan to pick this back up for my latest i_version series if > that sounds ok to you. > > Sorry for the back and forth, and thanks again! Hi Jeff, sure, no problem. I can drop the patch. The rest of the series is still valid though. Thanks! -Lukas > > Cheers, > -- > Jeff Layton <jlayton@kernel.org> > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v4 3/3] ext4: unconditionally enable the i_version counter 2022-08-29 8:17 ` Lukas Czerner @ 2022-08-29 10:16 ` Jeff Layton 0 siblings, 0 replies; 13+ messages in thread From: Jeff Layton @ 2022-08-29 10:16 UTC (permalink / raw) To: Lukas Czerner Cc: linux-ext4, tytso, jack, linux-fsdevel, ebiggers, david, Benjamin Coddington, Christoph Hellwig, Darrick J . Wong, Christian Brauner On Mon, 2022-08-29 at 10:17 +0200, Lukas Czerner wrote: > On Fri, Aug 26, 2022 at 12:11:23PM -0400, Jeff Layton wrote: > > On Wed, 2022-08-24 at 18:03 +0200, Lukas Czerner wrote: > > > From: Jeff Layton <jlayton@kernel.org> > > > > > > The original i_version implementation was pretty expensive, requiring a > > > log flush on every change. Because of this, it was gated behind a mount > > > option (implemented via the MS_I_VERSION mountoption flag). > > > > > > Commit ae5e165d855d (fs: new API for handling inode->i_version) made the > > > i_version flag much less expensive, so there is no longer a performance > > > penalty from enabling it. xfs and btrfs already enable it > > > unconditionally when the on-disk format can support it. > > > > > > Have ext4 ignore the SB_I_VERSION flag, and just enable it > > > unconditionally. While we're in here, remove the handling of > > > Opt_i_version as well, since we're almost to 5.20 anyway. > > > > > > Ideally, we'd couple this change with a way to disable the i_version > > > counter (just in case), but the way the iversion mount option was > > > implemented makes that difficult to do. We'd need to add a new mount > > > option altogether or do something with tune2fs. That's probably best > > > left to later patches if it turns out to be needed. > > > > > > [ Removed leftover bits of i_version from ext4_apply_options() since it > > > now can't ever be set in ctx->mask_s_flags -- lczerner ] > > > > > > Cc: Dave Chinner <david@fromorbit.com> > > > Cc: Benjamin Coddington <bcodding@redhat.com> > > > Cc: Christoph Hellwig <hch@infradead.org> > > > Cc: Darrick J. Wong <djwong@kernel.org> > > > Signed-off-by: Jeff Layton <jlayton@kernel.org> > > > Signed-off-by: Lukas Czerner <lczerner@redhat.com> > > > Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> > > > Reviewed-by: Jan Kara <jack@suse.cz> > > > --- > > > v3: Removed leftover bits of i_version from ext4_apply_options > > > v4: no change > > > > > > fs/ext4/inode.c | 5 ++--- > > > fs/ext4/super.c | 21 ++++----------------- > > > 2 files changed, 6 insertions(+), 20 deletions(-) > > > > > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c > > > index 2a220be34caa..c77d40f05763 100644 > > > --- a/fs/ext4/inode.c > > > +++ b/fs/ext4/inode.c > > > @@ -5425,7 +5425,7 @@ int ext4_setattr(struct user_namespace *mnt_userns, struct dentry *dentry, > > > return -EINVAL; > > > } > > > > > > - if (IS_I_VERSION(inode) && attr->ia_size != inode->i_size) > > > + if (attr->ia_size != inode->i_size) > > > inode_inc_iversion(inode); > > > > > > if (shrink) { > > > @@ -5735,8 +5735,7 @@ int ext4_mark_iloc_dirty(handle_t *handle, > > > * ea_inodes are using i_version for storing reference count, don't > > > * mess with it > > > */ > > > - if (IS_I_VERSION(inode) && > > > - !(EXT4_I(inode)->i_flags & EXT4_EA_INODE_FL)) > > > + if (!(EXT4_I(inode)->i_flags & EXT4_EA_INODE_FL)) > > > inode_inc_iversion(inode); > > > > > > /* the do_update_inode consumes one bh->b_count */ > > > diff --git a/fs/ext4/super.c b/fs/ext4/super.c > > > index 9a66abcca1a8..1c953f6d400e 100644 > > > --- a/fs/ext4/super.c > > > +++ b/fs/ext4/super.c > > > @@ -1585,7 +1585,7 @@ enum { > > > Opt_inlinecrypt, > > > Opt_usrjquota, Opt_grpjquota, Opt_quota, > > > Opt_noquota, Opt_barrier, Opt_nobarrier, Opt_err, > > > - Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_i_version, > > > + Opt_usrquota, Opt_grpquota, Opt_prjquota, > > > Opt_dax, Opt_dax_always, Opt_dax_inode, Opt_dax_never, > > > Opt_stripe, Opt_delalloc, Opt_nodelalloc, Opt_warn_on_error, > > > Opt_nowarn_on_error, Opt_mblk_io_submit, Opt_debug_want_extra_isize, > > > @@ -1694,7 +1694,6 @@ static const struct fs_parameter_spec ext4_param_specs[] = { > > > fsparam_flag ("barrier", Opt_barrier), > > > fsparam_u32 ("barrier", Opt_barrier), > > > fsparam_flag ("nobarrier", Opt_nobarrier), > > > - fsparam_flag ("i_version", Opt_i_version), > > > fsparam_flag ("dax", Opt_dax), > > > fsparam_enum ("dax", Opt_dax_type, ext4_param_dax), > > > fsparam_u32 ("stripe", Opt_stripe), > > > @@ -2140,11 +2139,6 @@ static int ext4_parse_param(struct fs_context *fc, struct fs_parameter *param) > > > case Opt_abort: > > > ctx_set_mount_flag(ctx, EXT4_MF_FS_ABORTED); > > > return 0; > > > - case Opt_i_version: > > > - ext4_msg(NULL, KERN_WARNING, deprecated_msg, param->key, "5.20"); > > > - ext4_msg(NULL, KERN_WARNING, "Use iversion instead\n"); > > > - ctx_set_flags(ctx, SB_I_VERSION); > > > - return 0; > > > case Opt_inlinecrypt: > > > #ifdef CONFIG_FS_ENCRYPTION_INLINE_CRYPT > > > ctx_set_flags(ctx, SB_INLINECRYPT); > > > @@ -2814,14 +2808,6 @@ static void ext4_apply_options(struct fs_context *fc, struct super_block *sb) > > > sb->s_flags &= ~ctx->mask_s_flags; > > > sb->s_flags |= ctx->vals_s_flags; > > > > > > - /* > > > - * i_version differs from common mount option iversion so we have > > > - * to let vfs know that it was set, otherwise it would get cleared > > > - * on remount > > > - */ > > > - if (ctx->mask_s_flags & SB_I_VERSION) > > > - fc->sb_flags |= SB_I_VERSION; > > > - > > > #define APPLY(X) ({ if (ctx->spec & EXT4_SPEC_##X) sbi->X = ctx->X; }) > > > APPLY(s_commit_interval); > > > APPLY(s_stripe); > > > @@ -2970,8 +2956,6 @@ static int _ext4_show_options(struct seq_file *seq, struct super_block *sb, > > > SEQ_OPTS_PRINT("min_batch_time=%u", sbi->s_min_batch_time); > > > if (nodefs || sbi->s_max_batch_time != EXT4_DEF_MAX_BATCH_TIME) > > > SEQ_OPTS_PRINT("max_batch_time=%u", sbi->s_max_batch_time); > > > - if (sb->s_flags & SB_I_VERSION) > > > - SEQ_OPTS_PUTS("i_version"); > > > if (nodefs || sbi->s_stripe) > > > SEQ_OPTS_PRINT("stripe=%lu", sbi->s_stripe); > > > if (nodefs || EXT4_MOUNT_DATA_FLAGS & > > > @@ -4640,6 +4624,9 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb) > > > sb->s_flags = (sb->s_flags & ~SB_POSIXACL) | > > > (test_opt(sb, POSIX_ACL) ? SB_POSIXACL : 0); > > > > > > + /* i_version is always enabled now */ > > > + sb->s_flags |= SB_I_VERSION; > > > + > > > if (le32_to_cpu(es->s_rev_level) == EXT4_GOOD_OLD_REV && > > > (ext4_has_compat_features(sb) || > > > ext4_has_ro_compat_features(sb) || > > > > Hi Lukas, > > > > I know I had originally asked you to shepherd this patch into mainline, > > but I think it may be better to wait on it for now. Since I asked that, > > we've since found out that ext4 is bumping the i_version counter on > > atime updates. It'd be best to get that fixed before we turn this on > > unconditionally, since it could cause a performance regression in some > > cases. I'll plan to pick this back up for my latest i_version series if > > that sounds ok to you. > > > > Sorry for the back and forth, and thanks again! > > Hi Jeff, > > sure, no problem. I can drop the patch. The rest of the series is still > valid though. > > Thanks! > -Lukas > > Yes, the rest is fine (AFAICT) Thanks! -- Jeff Layton <jlayton@kernel.org> ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v4 1/3] ext4: don't increase iversion counter for ea_inodes 2022-08-24 16:03 [PATCH v4 1/3] ext4: don't increase iversion counter for ea_inodes Lukas Czerner 2022-08-24 16:03 ` [PATCH v4 2/3] fs: record I_DIRTY_TIME even if inode already has I_DIRTY_INODE Lukas Czerner 2022-08-24 16:03 ` [PATCH v4 3/3] ext4: unconditionally enable the i_version counter Lukas Czerner @ 2022-09-29 14:58 ` Theodore Ts'o 2 siblings, 0 replies; 13+ messages in thread From: Theodore Ts'o @ 2022-09-29 14:58 UTC (permalink / raw) To: linux-ext4, lczerner Cc: Theodore Ts'o, jlayton, ebiggers, jack, david, brauner, linux-fsdevel On Wed, 24 Aug 2022 18:03:47 +0200, Lukas Czerner wrote: > ea_inodes are using i_version for storing part of the reference count so > we really need to leave it alone. > > The problem can be reproduced by xfstest ext4/026 when iversion is > enabled. Fix it by not calling inode_inc_iversion() for EXT4_EA_INODE_FL > inodes in ext4_mark_iloc_dirty(). > > [...] Applied, thanks! [1/3] ext4: don't increase iversion counter for ea_inodes commit: 6c7c5ade428cc65b58e4aba1925b5347970f4456 [2/3] fs: record I_DIRTY_TIME even if inode already has I_DIRTY_INODE commit: 625e1e67b66245b93ccae868cd4a950d257de003 [3/3] ext4: unconditionally enable the i_version counter commit: 59772a0cb09a7ec77362653e8be207a464fa04af Best regards, -- Theodore Ts'o <tytso@mit.edu> ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v3 1/3] ext4: don't increase iversion counter for ea_inodes @ 2022-08-12 12:37 Lukas Czerner 2022-08-12 12:37 ` [PATCH v4 3/3] ext4: unconditionally enable the i_version counter Lukas Czerner 0 siblings, 1 reply; 13+ messages in thread From: Lukas Czerner @ 2022-08-12 12:37 UTC (permalink / raw) To: linux-ext4; +Cc: tytso, jlayton, jack, linux-fsdevel, ebiggers, david ea_inodes are using i_version for storing part of the reference count so we really need to leave it alone. The problem can be reproduced by xfstest ext4/026 when iversion is enabled. Fix it by not calling inode_inc_iversion() for EXT4_EA_INODE_FL inodes in ext4_mark_iloc_dirty(). Signed-off-by: Lukas Czerner <lczerner@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Jeff Layton <jlayton@kernel.org> --- v2, v3: no change fs/ext4/inode.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 601214453c3a..2a220be34caa 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -5731,7 +5731,12 @@ int ext4_mark_iloc_dirty(handle_t *handle, } ext4_fc_track_inode(handle, inode); - if (IS_I_VERSION(inode)) + /* + * ea_inodes are using i_version for storing reference count, don't + * mess with it + */ + if (IS_I_VERSION(inode) && + !(EXT4_I(inode)->i_flags & EXT4_EA_INODE_FL)) inode_inc_iversion(inode); /* the do_update_inode consumes one bh->b_count */ -- 2.37.1 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v4 3/3] ext4: unconditionally enable the i_version counter 2022-08-12 12:37 [PATCH v3 " Lukas Czerner @ 2022-08-12 12:37 ` Lukas Czerner 2022-08-12 13:05 ` Christian Brauner 2022-08-16 11:48 ` Jan Kara 0 siblings, 2 replies; 13+ messages in thread From: Lukas Czerner @ 2022-08-12 12:37 UTC (permalink / raw) To: linux-ext4 Cc: tytso, jlayton, jack, linux-fsdevel, ebiggers, david, Benjamin Coddington, Christoph Hellwig, Darrick J . Wong From: Jeff Layton <jlayton@kernel.org> The original i_version implementation was pretty expensive, requiring a log flush on every change. Because of this, it was gated behind a mount option (implemented via the MS_I_VERSION mountoption flag). Commit ae5e165d855d (fs: new API for handling inode->i_version) made the i_version flag much less expensive, so there is no longer a performance penalty from enabling it. xfs and btrfs already enable it unconditionally when the on-disk format can support it. Have ext4 ignore the SB_I_VERSION flag, and just enable it unconditionally. While we're in here, remove the handling of Opt_i_version as well, since we're almost to 5.20 anyway. Ideally, we'd couple this change with a way to disable the i_version counter (just in case), but the way the iversion mount option was implemented makes that difficult to do. We'd need to add a new mount option altogether or do something with tune2fs. That's probably best left to later patches if it turns out to be needed. [ Removed leftover bits of i_version from ext4_apply_options() since it now can't ever be set in ctx->mask_s_flags -- lczerner ] Cc: Dave Chinner <david@fromorbit.com> Cc: Benjamin Coddington <bcodding@redhat.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Lukas Czerner <lczerner@redhat.com> --- v3: Removed leftover bits of i_version from ext4_apply_options v4: no change fs/ext4/inode.c | 5 ++--- fs/ext4/super.c | 21 ++++----------------- 2 files changed, 6 insertions(+), 20 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 2a220be34caa..c77d40f05763 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -5425,7 +5425,7 @@ int ext4_setattr(struct user_namespace *mnt_userns, struct dentry *dentry, return -EINVAL; } - if (IS_I_VERSION(inode) && attr->ia_size != inode->i_size) + if (attr->ia_size != inode->i_size) inode_inc_iversion(inode); if (shrink) { @@ -5735,8 +5735,7 @@ int ext4_mark_iloc_dirty(handle_t *handle, * ea_inodes are using i_version for storing reference count, don't * mess with it */ - if (IS_I_VERSION(inode) && - !(EXT4_I(inode)->i_flags & EXT4_EA_INODE_FL)) + if (!(EXT4_I(inode)->i_flags & EXT4_EA_INODE_FL)) inode_inc_iversion(inode); /* the do_update_inode consumes one bh->b_count */ diff --git a/fs/ext4/super.c b/fs/ext4/super.c index 9a66abcca1a8..1c953f6d400e 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -1585,7 +1585,7 @@ enum { Opt_inlinecrypt, Opt_usrjquota, Opt_grpjquota, Opt_quota, Opt_noquota, Opt_barrier, Opt_nobarrier, Opt_err, - Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_i_version, + Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_dax, Opt_dax_always, Opt_dax_inode, Opt_dax_never, Opt_stripe, Opt_delalloc, Opt_nodelalloc, Opt_warn_on_error, Opt_nowarn_on_error, Opt_mblk_io_submit, Opt_debug_want_extra_isize, @@ -1694,7 +1694,6 @@ static const struct fs_parameter_spec ext4_param_specs[] = { fsparam_flag ("barrier", Opt_barrier), fsparam_u32 ("barrier", Opt_barrier), fsparam_flag ("nobarrier", Opt_nobarrier), - fsparam_flag ("i_version", Opt_i_version), fsparam_flag ("dax", Opt_dax), fsparam_enum ("dax", Opt_dax_type, ext4_param_dax), fsparam_u32 ("stripe", Opt_stripe), @@ -2140,11 +2139,6 @@ static int ext4_parse_param(struct fs_context *fc, struct fs_parameter *param) case Opt_abort: ctx_set_mount_flag(ctx, EXT4_MF_FS_ABORTED); return 0; - case Opt_i_version: - ext4_msg(NULL, KERN_WARNING, deprecated_msg, param->key, "5.20"); - ext4_msg(NULL, KERN_WARNING, "Use iversion instead\n"); - ctx_set_flags(ctx, SB_I_VERSION); - return 0; case Opt_inlinecrypt: #ifdef CONFIG_FS_ENCRYPTION_INLINE_CRYPT ctx_set_flags(ctx, SB_INLINECRYPT); @@ -2814,14 +2808,6 @@ static void ext4_apply_options(struct fs_context *fc, struct super_block *sb) sb->s_flags &= ~ctx->mask_s_flags; sb->s_flags |= ctx->vals_s_flags; - /* - * i_version differs from common mount option iversion so we have - * to let vfs know that it was set, otherwise it would get cleared - * on remount - */ - if (ctx->mask_s_flags & SB_I_VERSION) - fc->sb_flags |= SB_I_VERSION; - #define APPLY(X) ({ if (ctx->spec & EXT4_SPEC_##X) sbi->X = ctx->X; }) APPLY(s_commit_interval); APPLY(s_stripe); @@ -2970,8 +2956,6 @@ static int _ext4_show_options(struct seq_file *seq, struct super_block *sb, SEQ_OPTS_PRINT("min_batch_time=%u", sbi->s_min_batch_time); if (nodefs || sbi->s_max_batch_time != EXT4_DEF_MAX_BATCH_TIME) SEQ_OPTS_PRINT("max_batch_time=%u", sbi->s_max_batch_time); - if (sb->s_flags & SB_I_VERSION) - SEQ_OPTS_PUTS("i_version"); if (nodefs || sbi->s_stripe) SEQ_OPTS_PRINT("stripe=%lu", sbi->s_stripe); if (nodefs || EXT4_MOUNT_DATA_FLAGS & @@ -4640,6 +4624,9 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb) sb->s_flags = (sb->s_flags & ~SB_POSIXACL) | (test_opt(sb, POSIX_ACL) ? SB_POSIXACL : 0); + /* i_version is always enabled now */ + sb->s_flags |= SB_I_VERSION; + if (le32_to_cpu(es->s_rev_level) == EXT4_GOOD_OLD_REV && (ext4_has_compat_features(sb) || ext4_has_ro_compat_features(sb) || -- 2.37.1 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v4 3/3] ext4: unconditionally enable the i_version counter 2022-08-12 12:37 ` [PATCH v4 3/3] ext4: unconditionally enable the i_version counter Lukas Czerner @ 2022-08-12 13:05 ` Christian Brauner 2022-08-16 11:48 ` Jan Kara 1 sibling, 0 replies; 13+ messages in thread From: Christian Brauner @ 2022-08-12 13:05 UTC (permalink / raw) To: Lukas Czerner Cc: linux-ext4, tytso, jlayton, jack, linux-fsdevel, ebiggers, david, Benjamin Coddington, Christoph Hellwig, Darrick J . Wong On Fri, Aug 12, 2022 at 02:37:27PM +0200, Lukas Czerner wrote: > From: Jeff Layton <jlayton@kernel.org> > > The original i_version implementation was pretty expensive, requiring a > log flush on every change. Because of this, it was gated behind a mount > option (implemented via the MS_I_VERSION mountoption flag). > > Commit ae5e165d855d (fs: new API for handling inode->i_version) made the > i_version flag much less expensive, so there is no longer a performance > penalty from enabling it. xfs and btrfs already enable it > unconditionally when the on-disk format can support it. > > Have ext4 ignore the SB_I_VERSION flag, and just enable it > unconditionally. While we're in here, remove the handling of > Opt_i_version as well, since we're almost to 5.20 anyway. > > Ideally, we'd couple this change with a way to disable the i_version > counter (just in case), but the way the iversion mount option was > implemented makes that difficult to do. We'd need to add a new mount > option altogether or do something with tune2fs. That's probably best > left to later patches if it turns out to be needed. > > [ Removed leftover bits of i_version from ext4_apply_options() since it > now can't ever be set in ctx->mask_s_flags -- lczerner ] > > Cc: Dave Chinner <david@fromorbit.com> > Cc: Benjamin Coddington <bcodding@redhat.com> > Cc: Christoph Hellwig <hch@infradead.org> > Cc: Darrick J. Wong <djwong@kernel.org> > Signed-off-by: Jeff Layton <jlayton@kernel.org> > Signed-off-by: Lukas Czerner <lczerner@redhat.com> > --- Since ext4 seems to ignore unknown mount options in ext4_parse_param() removing seems good, Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v4 3/3] ext4: unconditionally enable the i_version counter 2022-08-12 12:37 ` [PATCH v4 3/3] ext4: unconditionally enable the i_version counter Lukas Czerner 2022-08-12 13:05 ` Christian Brauner @ 2022-08-16 11:48 ` Jan Kara 1 sibling, 0 replies; 13+ messages in thread From: Jan Kara @ 2022-08-16 11:48 UTC (permalink / raw) To: Lukas Czerner Cc: linux-ext4, tytso, jlayton, jack, linux-fsdevel, ebiggers, david, Benjamin Coddington, Christoph Hellwig, Darrick J . Wong On Fri 12-08-22 14:37:27, Lukas Czerner wrote: > From: Jeff Layton <jlayton@kernel.org> > > The original i_version implementation was pretty expensive, requiring a > log flush on every change. Because of this, it was gated behind a mount > option (implemented via the MS_I_VERSION mountoption flag). > > Commit ae5e165d855d (fs: new API for handling inode->i_version) made the > i_version flag much less expensive, so there is no longer a performance > penalty from enabling it. xfs and btrfs already enable it > unconditionally when the on-disk format can support it. > > Have ext4 ignore the SB_I_VERSION flag, and just enable it > unconditionally. While we're in here, remove the handling of > Opt_i_version as well, since we're almost to 5.20 anyway. > > Ideally, we'd couple this change with a way to disable the i_version > counter (just in case), but the way the iversion mount option was > implemented makes that difficult to do. We'd need to add a new mount > option altogether or do something with tune2fs. That's probably best > left to later patches if it turns out to be needed. > > [ Removed leftover bits of i_version from ext4_apply_options() since it > now can't ever be set in ctx->mask_s_flags -- lczerner ] > > Cc: Dave Chinner <david@fromorbit.com> > Cc: Benjamin Coddington <bcodding@redhat.com> > Cc: Christoph Hellwig <hch@infradead.org> > Cc: Darrick J. Wong <djwong@kernel.org> > Signed-off-by: Jeff Layton <jlayton@kernel.org> > Signed-off-by: Lukas Czerner <lczerner@redhat.com> Looks good to me. Feel free to add: Reviewed-by: Jan Kara <jack@suse.cz> Honza > --- > v3: Removed leftover bits of i_version from ext4_apply_options > v4: no change > > fs/ext4/inode.c | 5 ++--- > fs/ext4/super.c | 21 ++++----------------- > 2 files changed, 6 insertions(+), 20 deletions(-) > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c > index 2a220be34caa..c77d40f05763 100644 > --- a/fs/ext4/inode.c > +++ b/fs/ext4/inode.c > @@ -5425,7 +5425,7 @@ int ext4_setattr(struct user_namespace *mnt_userns, struct dentry *dentry, > return -EINVAL; > } > > - if (IS_I_VERSION(inode) && attr->ia_size != inode->i_size) > + if (attr->ia_size != inode->i_size) > inode_inc_iversion(inode); > > if (shrink) { > @@ -5735,8 +5735,7 @@ int ext4_mark_iloc_dirty(handle_t *handle, > * ea_inodes are using i_version for storing reference count, don't > * mess with it > */ > - if (IS_I_VERSION(inode) && > - !(EXT4_I(inode)->i_flags & EXT4_EA_INODE_FL)) > + if (!(EXT4_I(inode)->i_flags & EXT4_EA_INODE_FL)) > inode_inc_iversion(inode); > > /* the do_update_inode consumes one bh->b_count */ > diff --git a/fs/ext4/super.c b/fs/ext4/super.c > index 9a66abcca1a8..1c953f6d400e 100644 > --- a/fs/ext4/super.c > +++ b/fs/ext4/super.c > @@ -1585,7 +1585,7 @@ enum { > Opt_inlinecrypt, > Opt_usrjquota, Opt_grpjquota, Opt_quota, > Opt_noquota, Opt_barrier, Opt_nobarrier, Opt_err, > - Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_i_version, > + Opt_usrquota, Opt_grpquota, Opt_prjquota, > Opt_dax, Opt_dax_always, Opt_dax_inode, Opt_dax_never, > Opt_stripe, Opt_delalloc, Opt_nodelalloc, Opt_warn_on_error, > Opt_nowarn_on_error, Opt_mblk_io_submit, Opt_debug_want_extra_isize, > @@ -1694,7 +1694,6 @@ static const struct fs_parameter_spec ext4_param_specs[] = { > fsparam_flag ("barrier", Opt_barrier), > fsparam_u32 ("barrier", Opt_barrier), > fsparam_flag ("nobarrier", Opt_nobarrier), > - fsparam_flag ("i_version", Opt_i_version), > fsparam_flag ("dax", Opt_dax), > fsparam_enum ("dax", Opt_dax_type, ext4_param_dax), > fsparam_u32 ("stripe", Opt_stripe), > @@ -2140,11 +2139,6 @@ static int ext4_parse_param(struct fs_context *fc, struct fs_parameter *param) > case Opt_abort: > ctx_set_mount_flag(ctx, EXT4_MF_FS_ABORTED); > return 0; > - case Opt_i_version: > - ext4_msg(NULL, KERN_WARNING, deprecated_msg, param->key, "5.20"); > - ext4_msg(NULL, KERN_WARNING, "Use iversion instead\n"); > - ctx_set_flags(ctx, SB_I_VERSION); > - return 0; > case Opt_inlinecrypt: > #ifdef CONFIG_FS_ENCRYPTION_INLINE_CRYPT > ctx_set_flags(ctx, SB_INLINECRYPT); > @@ -2814,14 +2808,6 @@ static void ext4_apply_options(struct fs_context *fc, struct super_block *sb) > sb->s_flags &= ~ctx->mask_s_flags; > sb->s_flags |= ctx->vals_s_flags; > > - /* > - * i_version differs from common mount option iversion so we have > - * to let vfs know that it was set, otherwise it would get cleared > - * on remount > - */ > - if (ctx->mask_s_flags & SB_I_VERSION) > - fc->sb_flags |= SB_I_VERSION; > - > #define APPLY(X) ({ if (ctx->spec & EXT4_SPEC_##X) sbi->X = ctx->X; }) > APPLY(s_commit_interval); > APPLY(s_stripe); > @@ -2970,8 +2956,6 @@ static int _ext4_show_options(struct seq_file *seq, struct super_block *sb, > SEQ_OPTS_PRINT("min_batch_time=%u", sbi->s_min_batch_time); > if (nodefs || sbi->s_max_batch_time != EXT4_DEF_MAX_BATCH_TIME) > SEQ_OPTS_PRINT("max_batch_time=%u", sbi->s_max_batch_time); > - if (sb->s_flags & SB_I_VERSION) > - SEQ_OPTS_PUTS("i_version"); > if (nodefs || sbi->s_stripe) > SEQ_OPTS_PRINT("stripe=%lu", sbi->s_stripe); > if (nodefs || EXT4_MOUNT_DATA_FLAGS & > @@ -4640,6 +4624,9 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb) > sb->s_flags = (sb->s_flags & ~SB_POSIXACL) | > (test_opt(sb, POSIX_ACL) ? SB_POSIXACL : 0); > > + /* i_version is always enabled now */ > + sb->s_flags |= SB_I_VERSION; > + > if (le32_to_cpu(es->s_rev_level) == EXT4_GOOD_OLD_REV && > (ext4_has_compat_features(sb) || > ext4_has_ro_compat_features(sb) || > -- > 2.37.1 > -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2022-09-29 14:59 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2022-08-24 16:03 [PATCH v4 1/3] ext4: don't increase iversion counter for ea_inodes Lukas Czerner 2022-08-24 16:03 ` [PATCH v4 2/3] fs: record I_DIRTY_TIME even if inode already has I_DIRTY_INODE Lukas Czerner 2022-08-24 17:31 ` Jan Kara 2022-08-25 10:06 ` [PATCH v5] " Lukas Czerner 2022-09-29 14:58 ` Theodore Ts'o 2022-08-24 16:03 ` [PATCH v4 3/3] ext4: unconditionally enable the i_version counter Lukas Czerner 2022-08-26 16:11 ` Jeff Layton 2022-08-29 8:17 ` Lukas Czerner 2022-08-29 10:16 ` Jeff Layton 2022-09-29 14:58 ` [PATCH v4 1/3] ext4: don't increase iversion counter for ea_inodes Theodore Ts'o -- strict thread matches above, loose matches on Subject: below -- 2022-08-12 12:37 [PATCH v3 " Lukas Czerner 2022-08-12 12:37 ` [PATCH v4 3/3] ext4: unconditionally enable the i_version counter Lukas Czerner 2022-08-12 13:05 ` Christian Brauner 2022-08-16 11:48 ` Jan Kara
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).