* [PATCH 1/2] ext4: don't increase iversion counter for ea_inodes
@ 2022-07-28 13:39 Lukas Czerner
2022-07-28 13:39 ` [PATCH 2/2] fs: record I_DIRTY_TIME even if inode already has I_DIRTY_INODE Lukas Czerner
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Lukas Czerner @ 2022-07-28 13:39 UTC (permalink / raw)
To: linux-ext4; +Cc: jlayton, tytso, linux-fsdevel
ea_inodes are using i_version for storing part of the reference count so
we really need to leave it alone.
The problem can be reproduced by xfstest ext4/026 when iversion is
enabled. Fix it by not calling inode_inc_iversion() for EXT4_EA_INODE_FL
inodes in ext4_mark_iloc_dirty().
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
---
fs/ext4/inode.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 84c0eb55071d..b76554124224 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -5717,7 +5717,12 @@ int ext4_mark_iloc_dirty(handle_t *handle,
}
ext4_fc_track_inode(handle, inode);
- if (IS_I_VERSION(inode))
+ /*
+ * ea_inodes are using i_version for storing reference count, don't
+ * mess with it
+ */
+ if (IS_I_VERSION(inode) &&
+ !(EXT4_I(inode)->i_flags & EXT4_EA_INODE_FL))
inode_inc_iversion(inode);
/* the do_update_inode consumes one bh->b_count */
--
2.35.3
^ permalink raw reply related [flat|nested] 9+ messages in thread* [PATCH 2/2] fs: record I_DIRTY_TIME even if inode already has I_DIRTY_INODE 2022-07-28 13:39 [PATCH 1/2] ext4: don't increase iversion counter for ea_inodes Lukas Czerner @ 2022-07-28 13:39 ` Lukas Czerner 2022-07-28 16:53 ` Jan Kara 2022-07-29 4:05 ` Eric Biggers 2022-07-28 15:52 ` [PATCH 1/2] ext4: don't increase iversion counter for ea_inodes Jan Kara 2022-08-02 11:58 ` Jeff Layton 2 siblings, 2 replies; 9+ messages in thread From: Lukas Czerner @ 2022-07-28 13:39 UTC (permalink / raw) To: linux-ext4; +Cc: jlayton, tytso, linux-fsdevel Currently the I_DIRTY_TIME will never get set if the inode already has I_DIRTY_INODE with assumption that it supersedes I_DIRTY_TIME. That's true, however ext4 will only update the on-disk inode in ->dirty_inode(), not on actual writeback. As a result if the inode already has I_DIRTY_INODE state by the time we get to __mark_inode_dirty() only with I_DIRTY_TIME, the time was already filled into on-disk inode and will not get updated until the next I_DIRTY_INODE update, which might never come if we crash or get a power failure. The problem can be reproduced on ext4 by running xfstest generic/622 with -o iversion mount option. Fix it by setting I_DIRTY_TIME even if the inode already has I_DIRTY_INODE. Also clear the I_DIRTY_TIME after ->dirty_inode() otherwise it may never get cleared. Signed-off-by: Lukas Czerner <lczerner@redhat.com> --- fs/fs-writeback.c | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-) diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c index 05221366a16d..174f01e6b912 100644 --- a/fs/fs-writeback.c +++ b/fs/fs-writeback.c @@ -2383,6 +2383,11 @@ void __mark_inode_dirty(struct inode *inode, int flags) /* I_DIRTY_INODE supersedes I_DIRTY_TIME. */ flags &= ~I_DIRTY_TIME; + if (inode->i_state & I_DIRTY_TIME) { + spin_lock(&inode->i_lock); + inode->i_state &= ~I_DIRTY_TIME; + spin_unlock(&inode->i_lock); + } } else { /* * Else it's either I_DIRTY_PAGES, I_DIRTY_TIME, or nothing. @@ -2399,13 +2404,20 @@ void __mark_inode_dirty(struct inode *inode, int flags) */ smp_mb(); - if (((inode->i_state & flags) == flags) || - (dirtytime && (inode->i_state & I_DIRTY_INODE))) + if ((inode->i_state & flags) == flags) return; spin_lock(&inode->i_lock); - if (dirtytime && (inode->i_state & I_DIRTY_INODE)) + if (dirtytime && (inode->i_state & I_DIRTY_INODE)) { + /* + * We've got a new lazytime update. Make sure it's recorded in + * i_state, because the time might have already got updated in + * ->dirty_inode() and will not get updated until next + * I_DIRTY_INODE update. + */ + inode->i_state |= I_DIRTY_TIME; goto out_unlock_inode; + } if ((inode->i_state & flags) != flags) { const int was_dirty = inode->i_state & I_DIRTY; -- 2.35.3 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 2/2] fs: record I_DIRTY_TIME even if inode already has I_DIRTY_INODE 2022-07-28 13:39 ` [PATCH 2/2] fs: record I_DIRTY_TIME even if inode already has I_DIRTY_INODE Lukas Czerner @ 2022-07-28 16:53 ` Jan Kara 2022-07-29 8:52 ` Lukas Czerner 2022-07-29 4:05 ` Eric Biggers 1 sibling, 1 reply; 9+ messages in thread From: Jan Kara @ 2022-07-28 16:53 UTC (permalink / raw) To: Lukas Czerner; +Cc: linux-ext4, jlayton, tytso, linux-fsdevel On Thu 28-07-22 15:39:14, Lukas Czerner wrote: > Currently the I_DIRTY_TIME will never get set if the inode already has > I_DIRTY_INODE with assumption that it supersedes I_DIRTY_TIME. That's > true, however ext4 will only update the on-disk inode in > ->dirty_inode(), not on actual writeback. As a result if the inode > already has I_DIRTY_INODE state by the time we get to > __mark_inode_dirty() only with I_DIRTY_TIME, the time was already filled > into on-disk inode and will not get updated until the next I_DIRTY_INODE > update, which might never come if we crash or get a power failure. > > The problem can be reproduced on ext4 by running xfstest generic/622 > with -o iversion mount option. Fix it by setting I_DIRTY_TIME even if > the inode already has I_DIRTY_INODE. As a datapoint I've checked and XFS has the very same problem as ext4. > Also clear the I_DIRTY_TIME after ->dirty_inode() otherwise it may never > get cleared. > > Signed-off-by: Lukas Czerner <lczerner@redhat.com> > --- > fs/fs-writeback.c | 18 +++++++++++++++--- > 1 file changed, 15 insertions(+), 3 deletions(-) > > diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c > index 05221366a16d..174f01e6b912 100644 > --- a/fs/fs-writeback.c > +++ b/fs/fs-writeback.c > @@ -2383,6 +2383,11 @@ void __mark_inode_dirty(struct inode *inode, int flags) > > /* I_DIRTY_INODE supersedes I_DIRTY_TIME. */ > flags &= ~I_DIRTY_TIME; > + if (inode->i_state & I_DIRTY_TIME) { > + spin_lock(&inode->i_lock); > + inode->i_state &= ~I_DIRTY_TIME; > + spin_unlock(&inode->i_lock); > + } Hum, so this is a bit dangerous because inode->i_state may be inconsistent with the writeback list inode is queued in (wb->b_dirty_time) and these two are supposed to be in sync. So I rather think we need to make sure we go through the full round of 'update flags and writeback list' below in case we need to clear I_DIRTY_TIME from inode->i_state. > } else { > /* > * Else it's either I_DIRTY_PAGES, I_DIRTY_TIME, or nothing. > @@ -2399,13 +2404,20 @@ void __mark_inode_dirty(struct inode *inode, int flags) > */ > smp_mb(); > > - if (((inode->i_state & flags) == flags) || > - (dirtytime && (inode->i_state & I_DIRTY_INODE))) > + if ((inode->i_state & flags) == flags) > return; > > spin_lock(&inode->i_lock); > - if (dirtytime && (inode->i_state & I_DIRTY_INODE)) > + if (dirtytime && (inode->i_state & I_DIRTY_INODE)) { > + /* > + * We've got a new lazytime update. Make sure it's recorded in > + * i_state, because the time might have already got updated in > + * ->dirty_inode() and will not get updated until next > + * I_DIRTY_INODE update. > + */ > + inode->i_state |= I_DIRTY_TIME; > goto out_unlock_inode; > + } So I'm afraid this combination is not properly handled in writeback_single_inode() where we have at the end: if (!(inode->i_state & I_DIRTY_ALL)) inode_cgwb_move_to_attached(inode, wb); else if (!(inode->i_state & I_SYNC_QUEUED) && (inode->i_state & I_DIRTY)) redirty_tail_locked(inode, wb); So inode that had I_DIRTY_SYNC | I_DIRTY_TIME will not be properly refiled to wb->b_dirty_time list after writeback was done and I_DIRTY_SYNC got cleared. So we need to refine it to something like: if (!(inode->i_state & I_DIRTY_ALL)) inode_cgwb_move_to_attached(inode, wb); else if (!(inode->i_state & I_SYNC_QUEUED)) { if (inode->i_state & I_DIRTY) { redirty_tail_locked(inode, wb); } else if (inode->i_state & I_DIRTY_TIME) { inode->dirtied_when = jiffies; inode_io_list_move_locked(inode, wb, &wb->b_dirty_time); } } Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 2/2] fs: record I_DIRTY_TIME even if inode already has I_DIRTY_INODE 2022-07-28 16:53 ` Jan Kara @ 2022-07-29 8:52 ` Lukas Czerner 2022-07-29 11:18 ` Jan Kara 0 siblings, 1 reply; 9+ messages in thread From: Lukas Czerner @ 2022-07-29 8:52 UTC (permalink / raw) To: Jan Kara; +Cc: linux-ext4, jlayton, tytso, linux-fsdevel On Thu, Jul 28, 2022 at 06:53:32PM +0200, Jan Kara wrote: > On Thu 28-07-22 15:39:14, Lukas Czerner wrote: > > Currently the I_DIRTY_TIME will never get set if the inode already has > > I_DIRTY_INODE with assumption that it supersedes I_DIRTY_TIME. That's > > true, however ext4 will only update the on-disk inode in > > ->dirty_inode(), not on actual writeback. As a result if the inode > > already has I_DIRTY_INODE state by the time we get to > > __mark_inode_dirty() only with I_DIRTY_TIME, the time was already filled > > into on-disk inode and will not get updated until the next I_DIRTY_INODE > > update, which might never come if we crash or get a power failure. > > > > The problem can be reproduced on ext4 by running xfstest generic/622 > > with -o iversion mount option. Fix it by setting I_DIRTY_TIME even if > > the inode already has I_DIRTY_INODE. Hi Jan, thanks for th review. > > As a datapoint I've checked and XFS has the very same problem as ext4. Very interesting, I did look at xfs when I was debugging this problem and wans't able to tell whether they have the same problem or not, but it certainly can't be reproduced by generic/622. Or at least I can't reproduce it on XFS. So I wonder what is XFS doing differently in that case. > > > Also clear the I_DIRTY_TIME after ->dirty_inode() otherwise it may never > > get cleared. > > > > Signed-off-by: Lukas Czerner <lczerner@redhat.com> > > --- > > fs/fs-writeback.c | 18 +++++++++++++++--- > > 1 file changed, 15 insertions(+), 3 deletions(-) > > > > diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c > > index 05221366a16d..174f01e6b912 100644 > > --- a/fs/fs-writeback.c > > +++ b/fs/fs-writeback.c > > @@ -2383,6 +2383,11 @@ void __mark_inode_dirty(struct inode *inode, int flags) > > > > /* I_DIRTY_INODE supersedes I_DIRTY_TIME. */ > > flags &= ~I_DIRTY_TIME; > > + if (inode->i_state & I_DIRTY_TIME) { > > + spin_lock(&inode->i_lock); > > + inode->i_state &= ~I_DIRTY_TIME; > > + spin_unlock(&inode->i_lock); > > + } > > Hum, so this is a bit dangerous because inode->i_state may be inconsistent > with the writeback list inode is queued in (wb->b_dirty_time) and these two > are supposed to be in sync. So I rather think we need to make sure we go > through the full round of 'update flags and writeback list' below in case > we need to clear I_DIRTY_TIME from inode->i_state. Ok, so we're clearing I_DIRTY_TIME in __ext4_update_other_inode_time() which will opportunistically update the time fields for inodes in the same block as the inode we're doing an update for via ext4_do_update_inode(). Don't we also need to rewire that differently? XFS is also clearing it on it's own in log code, but I can't tell if it has the same problem as you describe here. > > > } else { > > /* > > * Else it's either I_DIRTY_PAGES, I_DIRTY_TIME, or nothing. > > @@ -2399,13 +2404,20 @@ void __mark_inode_dirty(struct inode *inode, int flags) > > */ > > smp_mb(); > > > > - if (((inode->i_state & flags) == flags) || > > - (dirtytime && (inode->i_state & I_DIRTY_INODE))) > > + if ((inode->i_state & flags) == flags) > > return; > > > > spin_lock(&inode->i_lock); > > - if (dirtytime && (inode->i_state & I_DIRTY_INODE)) > > + if (dirtytime && (inode->i_state & I_DIRTY_INODE)) { > > + /* > > + * We've got a new lazytime update. Make sure it's recorded in > > + * i_state, because the time might have already got updated in > > + * ->dirty_inode() and will not get updated until next > > + * I_DIRTY_INODE update. > > + */ > > + inode->i_state |= I_DIRTY_TIME; > > goto out_unlock_inode; > > + } > > So I'm afraid this combination is not properly handled in > writeback_single_inode() where we have at the end: > > if (!(inode->i_state & I_DIRTY_ALL)) > inode_cgwb_move_to_attached(inode, wb); > else if (!(inode->i_state & I_SYNC_QUEUED) && > (inode->i_state & I_DIRTY)) > redirty_tail_locked(inode, wb); > > So inode that had I_DIRTY_SYNC | I_DIRTY_TIME will not be properly refiled > to wb->b_dirty_time list after writeback was done and I_DIRTY_SYNC got > cleared. > > So we need to refine it to something like: > > if (!(inode->i_state & I_DIRTY_ALL)) > inode_cgwb_move_to_attached(inode, wb); > else if (!(inode->i_state & I_SYNC_QUEUED)) { > if (inode->i_state & I_DIRTY) { > redirty_tail_locked(inode, wb); > } else if (inode->i_state & I_DIRTY_TIME) { > inode->dirtied_when = jiffies; > inode_io_list_move_locked(inode, wb, &wb->b_dirty_time); > } > } Very nice, thanks, I'll have a look. -Lukas > > Honza > -- > Jan Kara <jack@suse.com> > SUSE Labs, CR > ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 2/2] fs: record I_DIRTY_TIME even if inode already has I_DIRTY_INODE 2022-07-29 8:52 ` Lukas Czerner @ 2022-07-29 11:18 ` Jan Kara 0 siblings, 0 replies; 9+ messages in thread From: Jan Kara @ 2022-07-29 11:18 UTC (permalink / raw) To: Lukas Czerner; +Cc: Jan Kara, linux-ext4, jlayton, tytso, linux-fsdevel On Fri 29-07-22 10:52:19, Lukas Czerner wrote: > On Thu, Jul 28, 2022 at 06:53:32PM +0200, Jan Kara wrote: > > On Thu 28-07-22 15:39:14, Lukas Czerner wrote: > > > Currently the I_DIRTY_TIME will never get set if the inode already has > > > I_DIRTY_INODE with assumption that it supersedes I_DIRTY_TIME. That's > > > true, however ext4 will only update the on-disk inode in > > > ->dirty_inode(), not on actual writeback. As a result if the inode > > > already has I_DIRTY_INODE state by the time we get to > > > __mark_inode_dirty() only with I_DIRTY_TIME, the time was already filled > > > into on-disk inode and will not get updated until the next I_DIRTY_INODE > > > update, which might never come if we crash or get a power failure. > > > > > > The problem can be reproduced on ext4 by running xfstest generic/622 > > > with -o iversion mount option. Fix it by setting I_DIRTY_TIME even if > > > the inode already has I_DIRTY_INODE. > > Hi Jan, > > thanks for th review. > > > > > As a datapoint I've checked and XFS has the very same problem as ext4. > > Very interesting, I did look at xfs when I was debugging this problem > and wans't able to tell whether they have the same problem or not, but > it certainly can't be reproduced by generic/622. Or at least I can't > reproduce it on XFS. > > So I wonder what is XFS doing differently in that case. OK, that's a bit curious but xfs has xfs_fs_dirty_inode() that's there exactly to update timestamps when lazytime period expires. So in theory it seems possible we lose the timestamp update. > > > Also clear the I_DIRTY_TIME after ->dirty_inode() otherwise it may never > > > get cleared. > > > > > > Signed-off-by: Lukas Czerner <lczerner@redhat.com> > > > --- > > > fs/fs-writeback.c | 18 +++++++++++++++--- > > > 1 file changed, 15 insertions(+), 3 deletions(-) > > > > > > diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c > > > index 05221366a16d..174f01e6b912 100644 > > > --- a/fs/fs-writeback.c > > > +++ b/fs/fs-writeback.c > > > @@ -2383,6 +2383,11 @@ void __mark_inode_dirty(struct inode *inode, int flags) > > > > > > /* I_DIRTY_INODE supersedes I_DIRTY_TIME. */ > > > flags &= ~I_DIRTY_TIME; > > > + if (inode->i_state & I_DIRTY_TIME) { > > > + spin_lock(&inode->i_lock); > > > + inode->i_state &= ~I_DIRTY_TIME; > > > + spin_unlock(&inode->i_lock); > > > + } > > > > Hum, so this is a bit dangerous because inode->i_state may be inconsistent > > with the writeback list inode is queued in (wb->b_dirty_time) and these two > > are supposed to be in sync. So I rather think we need to make sure we go > > through the full round of 'update flags and writeback list' below in case > > we need to clear I_DIRTY_TIME from inode->i_state. > > Ok, so we're clearing I_DIRTY_TIME in __ext4_update_other_inode_time() > which will opportunistically update the time fields for inodes in the > same block as the inode we're doing an update for via > ext4_do_update_inode(). Don't we also need to rewire that differently? > > XFS is also clearing it on it's own in log code, but I can't tell if it > has the same problem as you describe here. Yes, we'll possibly have clean inodes still on wb->b_dirty_time list. Checking the code, this should be safe in the end. But thinking more about the possible races these two places clearing I_DIRTY_TIME are safe because we copy timestamps to on-disk inode after clearing I_DIRTY_TIME. But your clearing of I_DIRTY_TIME in __mark_inode_dirty() could result in loosing timestamp update if it races in the wrong way (basically the bug you're trying to fix would remain unfixed). Hum, thinking about it, even clearing of I_DIRTY_TIME later in __mark_inode_dirty is problematic. There is still a race like: CPU1 CPU2 __mark_inode_dirty(inode, I_DIRTY_TIME) sets I_DIRTY_TIME in inode->i_state __mark_inode_dirty(inode, I_DIRTY_SYNC) ->dirty_inode() - copies timestamps __mark_inode_dirty(inode, I_DIRTY_TIME) I_DIRTY_TIME already set -> bail ... if (flags & I_DIRTY_INODE) inode->i_state &= ~I_DIRTY_TIME; and we have just lost the second timestamp update. To fix this we'd need to clear I_DIRTY_TIME in inode->i_state before calling ->dirty_inode() but that clashes with XFS' usage of ->dirty_inode which uses I_DIRTY_TIME in inode->i_state to detect that timestamp update is requested. I think we could do something like: if (flags & I_DIRTY_INODE) { /* Inode timestamp update will piggback on this dirtying */ if (inode->i_state & I_DIRTY_TIME) { spin_lock(&inode->i_lock); if (inode->i_state & I_DIRTY_TIME) { inode->i_state &= ~I_DIRTY_TIME; flags |= I_DIRTY_TIME; } spin_unlock(&inode->i_lock); } ... if (sb->s_op->dirty_inode) sb->s_op->dirty_inode(inode, flags & (I_DIRTY_INODE | I_DIRTY_TIME)); ... } And then XFS could check for I_DIRTY_TIME in flags to detect what it needs to do. Hopefully now things are correct ;). Famous last words... Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 2/2] fs: record I_DIRTY_TIME even if inode already has I_DIRTY_INODE 2022-07-28 13:39 ` [PATCH 2/2] fs: record I_DIRTY_TIME even if inode already has I_DIRTY_INODE Lukas Czerner 2022-07-28 16:53 ` Jan Kara @ 2022-07-29 4:05 ` Eric Biggers 2022-07-29 8:54 ` Lukas Czerner 1 sibling, 1 reply; 9+ messages in thread From: Eric Biggers @ 2022-07-29 4:05 UTC (permalink / raw) To: Lukas Czerner; +Cc: linux-ext4, jlayton, tytso, linux-fsdevel On Thu, Jul 28, 2022 at 03:39:14PM +0200, Lukas Czerner wrote: > Currently the I_DIRTY_TIME will never get set if the inode already has > I_DIRTY_INODE with assumption that it supersedes I_DIRTY_TIME. That's > true, however ext4 will only update the on-disk inode in > ->dirty_inode(), not on actual writeback. As a result if the inode > already has I_DIRTY_INODE state by the time we get to > __mark_inode_dirty() only with I_DIRTY_TIME, the time was already filled > into on-disk inode and will not get updated until the next I_DIRTY_INODE > update, which might never come if we crash or get a power failure. > > The problem can be reproduced on ext4 by running xfstest generic/622 > with -o iversion mount option. Fix it by setting I_DIRTY_TIME even if > the inode already has I_DIRTY_INODE. > > Also clear the I_DIRTY_TIME after ->dirty_inode() otherwise it may never > get cleared. > > Signed-off-by: Lukas Czerner <lczerner@redhat.com> If you're going to change the meaning of I_* flags, please update the comment in include/linux/fs.h that describes what they mean. - Eric ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 2/2] fs: record I_DIRTY_TIME even if inode already has I_DIRTY_INODE 2022-07-29 4:05 ` Eric Biggers @ 2022-07-29 8:54 ` Lukas Czerner 0 siblings, 0 replies; 9+ messages in thread From: Lukas Czerner @ 2022-07-29 8:54 UTC (permalink / raw) To: Eric Biggers; +Cc: linux-ext4, jlayton, tytso, linux-fsdevel On Thu, Jul 28, 2022 at 09:05:11PM -0700, Eric Biggers wrote: > On Thu, Jul 28, 2022 at 03:39:14PM +0200, Lukas Czerner wrote: > > Currently the I_DIRTY_TIME will never get set if the inode already has > > I_DIRTY_INODE with assumption that it supersedes I_DIRTY_TIME. That's > > true, however ext4 will only update the on-disk inode in > > ->dirty_inode(), not on actual writeback. As a result if the inode > > already has I_DIRTY_INODE state by the time we get to > > __mark_inode_dirty() only with I_DIRTY_TIME, the time was already filled > > into on-disk inode and will not get updated until the next I_DIRTY_INODE > > update, which might never come if we crash or get a power failure. > > > > The problem can be reproduced on ext4 by running xfstest generic/622 > > with -o iversion mount option. Fix it by setting I_DIRTY_TIME even if > > the inode already has I_DIRTY_INODE. > > > > Also clear the I_DIRTY_TIME after ->dirty_inode() otherwise it may never > > get cleared. > > > > Signed-off-by: Lukas Czerner <lczerner@redhat.com> > > If you're going to change the meaning of I_* flags, please update the comment in > include/linux/fs.h that describes what they mean. > > - Eric Good point, it does say that I_DIRTY_TIME and I_DIRTY_INODE can't be both set. Thanks! -Lukas ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] ext4: don't increase iversion counter for ea_inodes 2022-07-28 13:39 [PATCH 1/2] ext4: don't increase iversion counter for ea_inodes Lukas Czerner 2022-07-28 13:39 ` [PATCH 2/2] fs: record I_DIRTY_TIME even if inode already has I_DIRTY_INODE Lukas Czerner @ 2022-07-28 15:52 ` Jan Kara 2022-08-02 11:58 ` Jeff Layton 2 siblings, 0 replies; 9+ messages in thread From: Jan Kara @ 2022-07-28 15:52 UTC (permalink / raw) To: Lukas Czerner; +Cc: linux-ext4, jlayton, tytso, linux-fsdevel On Thu 28-07-22 15:39:13, Lukas Czerner wrote: > ea_inodes are using i_version for storing part of the reference count so > we really need to leave it alone. > > The problem can be reproduced by xfstest ext4/026 when iversion is > enabled. Fix it by not calling inode_inc_iversion() for EXT4_EA_INODE_FL > inodes in ext4_mark_iloc_dirty(). > > Signed-off-by: Lukas Czerner <lczerner@redhat.com> Looks good. Feel free to add: Reviewed-by: Jan Kara <jack@suse.cz> Honza > --- > fs/ext4/inode.c | 7 ++++++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c > index 84c0eb55071d..b76554124224 100644 > --- a/fs/ext4/inode.c > +++ b/fs/ext4/inode.c > @@ -5717,7 +5717,12 @@ int ext4_mark_iloc_dirty(handle_t *handle, > } > ext4_fc_track_inode(handle, inode); > > - if (IS_I_VERSION(inode)) > + /* > + * ea_inodes are using i_version for storing reference count, don't > + * mess with it > + */ > + if (IS_I_VERSION(inode) && > + !(EXT4_I(inode)->i_flags & EXT4_EA_INODE_FL)) > inode_inc_iversion(inode); > > /* the do_update_inode consumes one bh->b_count */ > -- > 2.35.3 > -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] ext4: don't increase iversion counter for ea_inodes 2022-07-28 13:39 [PATCH 1/2] ext4: don't increase iversion counter for ea_inodes Lukas Czerner 2022-07-28 13:39 ` [PATCH 2/2] fs: record I_DIRTY_TIME even if inode already has I_DIRTY_INODE Lukas Czerner 2022-07-28 15:52 ` [PATCH 1/2] ext4: don't increase iversion counter for ea_inodes Jan Kara @ 2022-08-02 11:58 ` Jeff Layton 2 siblings, 0 replies; 9+ messages in thread From: Jeff Layton @ 2022-08-02 11:58 UTC (permalink / raw) To: Lukas Czerner, linux-ext4; +Cc: tytso, linux-fsdevel On Thu, 2022-07-28 at 15:39 +0200, Lukas Czerner wrote: > ea_inodes are using i_version for storing part of the reference count so > we really need to leave it alone. > > The problem can be reproduced by xfstest ext4/026 when iversion is > enabled. Fix it by not calling inode_inc_iversion() for EXT4_EA_INODE_FL > inodes in ext4_mark_iloc_dirty(). > > Signed-off-by: Lukas Czerner <lczerner@redhat.com> > --- > fs/ext4/inode.c | 7 ++++++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c > index 84c0eb55071d..b76554124224 100644 > --- a/fs/ext4/inode.c > +++ b/fs/ext4/inode.c > @@ -5717,7 +5717,12 @@ int ext4_mark_iloc_dirty(handle_t *handle, > } > ext4_fc_track_inode(handle, inode); > > - if (IS_I_VERSION(inode)) > + /* > + * ea_inodes are using i_version for storing reference count, don't > + * mess with it > + */ > + if (IS_I_VERSION(inode) && > + !(EXT4_I(inode)->i_flags & EXT4_EA_INODE_FL)) > inode_inc_iversion(inode); > > /* the do_update_inode consumes one bh->b_count */ Reviewed-by: Jeff Layton <jlayton@kernel.org> ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2022-08-02 11:58 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2022-07-28 13:39 [PATCH 1/2] ext4: don't increase iversion counter for ea_inodes Lukas Czerner 2022-07-28 13:39 ` [PATCH 2/2] fs: record I_DIRTY_TIME even if inode already has I_DIRTY_INODE Lukas Czerner 2022-07-28 16:53 ` Jan Kara 2022-07-29 8:52 ` Lukas Czerner 2022-07-29 11:18 ` Jan Kara 2022-07-29 4:05 ` Eric Biggers 2022-07-29 8:54 ` Lukas Czerner 2022-07-28 15:52 ` [PATCH 1/2] ext4: don't increase iversion counter for ea_inodes Jan Kara 2022-08-02 11:58 ` Jeff Layton
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).