* [PATCH v3 0/2] ext4: fix fast commit inode enqueueing during a full journal commit
@ 2024-05-29 9:20 Luis Henriques (SUSE)
2024-05-29 9:20 ` [PATCH v3 1/2] " Luis Henriques (SUSE)
` (3 more replies)
0 siblings, 4 replies; 14+ messages in thread
From: Luis Henriques (SUSE) @ 2024-05-29 9:20 UTC (permalink / raw)
To: Theodore Ts'o, Andreas Dilger, Jan Kara, Harshad Shirwadkar
Cc: linux-ext4, linux-kernel, Luis Henriques (SUSE)
Hi!
Here's v3 of this fix to the fast commit enqueuing bug triggered by fstest
generic/047. This version simplifies the previous patch version by re-using
the i_sync_tid field in struct ext4_inode_info instead of adding a new one.
The extra patch includes a few extra fixes to the tid_t type handling. Jan
brought to my attention the fact that this sequence number may wrap, and I
quickly found a few places in the code where the tid_geq() and tid_gt()
helpers had to be used.
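For anyone not familiar with the wrap problem, here is a minimal userspace sketch of why a plain '>' is not enough. It is modeled on the jbd2 helpers of the same name, but it is not the kernel code itself: using uint32_t as a stand-in for tid_t is an assumption made for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: a 32-bit unsigned counter stands in for the kernel's tid_t. */
typedef uint32_t tid_t;

/*
 * Wrap-safe "x is newer than y": subtract in unsigned arithmetic and
 * interpret the result as signed, so a tid that has just wrapped past
 * zero still compares as newer than one shortly before the wrap.
 */
static inline int tid_gt(tid_t x, tid_t y)
{
	int32_t difference = (int32_t)(x - y);

	return difference > 0;
}

/* Wrap-safe "x is the same as or newer than y". */
static inline int tid_geq(tid_t x, tid_t y)
{
	int32_t difference = (int32_t)(x - y);

	return difference >= 0;
}
```

With these, tid_gt(2, 0xfffffff0) is true even though 2 > 0xfffffff0 is false, which is exactly the case the plain comparisons get wrong.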
Again, please note that this fix requires [1] to be applied too.
[1] https://lore.kernel.org/all/20240515082857.32730-1-luis.henriques@linux.dev
Luis Henriques (SUSE) (2):
ext4: fix fast commit inode enqueueing during a full journal commit
ext4: fix possible tid_t sequence overflows
fs/ext4/fast_commit.c | 17 +++++++++++++----
1 file changed, 13 insertions(+), 4 deletions(-)
* [PATCH v3 1/2] ext4: fix fast commit inode enqueueing during a full journal commit
2024-05-29 9:20 [PATCH v3 0/2] ext4: fix fast commit inode enqueueing during a full journal commit Luis Henriques (SUSE)
@ 2024-05-29 9:20 ` Luis Henriques (SUSE)
2024-05-29 9:50 ` Jan Kara
2024-07-09 3:59 ` Theodore Ts'o
2024-05-29 9:20 ` [PATCH v3 2/2] ext4: fix possible tid_t sequence overflows Luis Henriques (SUSE)
` (2 subsequent siblings)
3 siblings, 2 replies; 14+ messages in thread
From: Luis Henriques (SUSE) @ 2024-05-29 9:20 UTC (permalink / raw)
To: Theodore Ts'o, Andreas Dilger, Jan Kara, Harshad Shirwadkar
Cc: linux-ext4, linux-kernel, Luis Henriques (SUSE)
When a full journal commit is on-going, any fast commit has to be enqueued
into a different queue: FC_Q_STAGING instead of FC_Q_MAIN. This enqueueing
is done only once, i.e. if an inode is already queued in a previous fast
commit entry it won't be enqueued again. However, if a full commit starts
_after_ the inode is enqueued into FC_Q_MAIN, the next fast commit needs to
be done into FC_Q_STAGING. And this is not being done in function
ext4_fc_track_template().
This patch fixes the issue by re-enqueuing an inode into the STAGING queue
during the fast commit clean-up callback if it has a tid (i_sync_tid)
greater than the one being handled. The STAGING queue will then be spliced
back into MAIN.
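The intended behaviour can be sketched with a small userspace model. This is only an illustration under simplifying assumptions: struct model_inode, enum fc_queue and model_fc_cleanup() are hypothetical names, the queues are reduced to a per-inode tag instead of real list heads, and tid wrap-around is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t tid_t;

/* Hypothetical tag replacing the real FC_Q_MAIN/FC_Q_STAGING list heads. */
enum fc_queue { FC_Q_NONE, FC_Q_MAIN, FC_Q_STAGING };

/* Hypothetical, heavily reduced stand-in for struct ext4_inode_info. */
struct model_inode {
	tid_t i_sync_tid;	/* last transaction that modified the inode */
	enum fc_queue queue;	/* queue the inode currently sits on */
};

/* Models the per-inode part of the clean-up for transaction 'tid'. */
static void model_fc_cleanup(struct model_inode *inodes, size_t n, tid_t tid)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (inodes[i].queue != FC_Q_MAIN)
			continue;
		/* Plain comparison, as in this patch; wrap is not modeled. */
		if (inodes[i].i_sync_tid <= tid)
			inodes[i].queue = FC_Q_NONE;	/* fully committed */
		else
			inodes[i].queue = FC_Q_STAGING;	/* has newer changes */
	}

	/* STAGING is spliced back into MAIN at the end of the clean-up. */
	for (i = 0; i < n; i++)
		if (inodes[i].queue == FC_Q_STAGING)
			inodes[i].queue = FC_Q_MAIN;
}
```

In this model an inode whose i_sync_tid is newer than the tid being cleaned up ends up back on MAIN instead of being dropped, so a later fast commit still sees it.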
This bug was found using fstest generic/047. This test creates several
32k-byte files, sync'ing each of them after its creation, and then shuts
down the filesystem. Some data may be lost in this operation; for example, a
file may have its size truncated to zero.
Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
---
fs/ext4/fast_commit.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
index 87c009e0c59a..088bd509b116 100644
--- a/fs/ext4/fast_commit.c
+++ b/fs/ext4/fast_commit.c
@@ -1282,8 +1282,17 @@ static void ext4_fc_cleanup(journal_t *journal, int full, tid_t tid)
list_del_init(&iter->i_fc_list);
ext4_clear_inode_state(&iter->vfs_inode,
EXT4_STATE_FC_COMMITTING);
- if (iter->i_sync_tid <= tid)
+ if (iter->i_sync_tid <= tid) {
ext4_fc_reset_inode(&iter->vfs_inode);
+ } else {
+ /*
+ * re-enqueue inode into STAGING, which later will be
+ * splice back into MAIN
+ */
+ list_add_tail(&EXT4_I(&iter->vfs_inode)->i_fc_list,
+ &sbi->s_fc_q[FC_Q_STAGING]);
+ }
+
/* Make sure EXT4_STATE_FC_COMMITTING bit is clear */
smp_mb();
#if (BITS_PER_LONG < 64)
* [PATCH v3 2/2] ext4: fix possible tid_t sequence overflows
2024-05-29 9:20 [PATCH v3 0/2] ext4: fix fast commit inode enqueueing during a full journal commit Luis Henriques (SUSE)
2024-05-29 9:20 ` [PATCH v3 1/2] " Luis Henriques (SUSE)
@ 2024-05-29 9:20 ` Luis Henriques (SUSE)
2024-05-29 9:51 ` Jan Kara
2024-06-27 13:54 ` [PATCH v3 0/2] ext4: fix fast commit inode enqueueing during a full journal commit Luis Henriques
2024-07-11 2:35 ` Theodore Ts'o
3 siblings, 1 reply; 14+ messages in thread
From: Luis Henriques (SUSE) @ 2024-05-29 9:20 UTC (permalink / raw)
To: Theodore Ts'o, Andreas Dilger, Jan Kara, Harshad Shirwadkar
Cc: linux-ext4, linux-kernel, Luis Henriques (SUSE)
In the fast commit code there are a few places where tid_t variables are
being compared without taking into account the fact that these sequence
numbers may wrap. Fix this issue by using the helper functions tid_gt()
and tid_geq().
Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
---
fs/ext4/fast_commit.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
index 088bd509b116..30d312e16916 100644
--- a/fs/ext4/fast_commit.c
+++ b/fs/ext4/fast_commit.c
@@ -353,7 +353,7 @@ void ext4_fc_mark_ineligible(struct super_block *sb, int reason, handle_t *handl
read_unlock(&sbi->s_journal->j_state_lock);
}
spin_lock(&sbi->s_fc_lock);
- if (sbi->s_fc_ineligible_tid < tid)
+ if (tid_gt(tid, sbi->s_fc_ineligible_tid))
sbi->s_fc_ineligible_tid = tid;
spin_unlock(&sbi->s_fc_lock);
WARN_ON(reason >= EXT4_FC_REASON_MAX);
@@ -1207,7 +1207,7 @@ int ext4_fc_commit(journal_t *journal, tid_t commit_tid)
if (ret == -EALREADY) {
/* There was an ongoing commit, check if we need to restart */
if (atomic_read(&sbi->s_fc_subtid) <= subtid &&
- commit_tid > journal->j_commit_sequence)
+ tid_gt(commit_tid, journal->j_commit_sequence))
goto restart_fc;
ext4_fc_update_stats(sb, EXT4_FC_STATUS_SKIPPED, 0, 0,
commit_tid);
@@ -1282,7 +1282,7 @@ static void ext4_fc_cleanup(journal_t *journal, int full, tid_t tid)
list_del_init(&iter->i_fc_list);
ext4_clear_inode_state(&iter->vfs_inode,
EXT4_STATE_FC_COMMITTING);
- if (iter->i_sync_tid <= tid) {
+ if (tid_geq(tid, iter->i_sync_tid)) {
ext4_fc_reset_inode(&iter->vfs_inode);
} else {
/*
@@ -1322,7 +1322,7 @@ static void ext4_fc_cleanup(journal_t *journal, int full, tid_t tid)
list_splice_init(&sbi->s_fc_q[FC_Q_STAGING],
&sbi->s_fc_q[FC_Q_MAIN]);
- if (tid >= sbi->s_fc_ineligible_tid) {
+ if (tid_geq(tid, sbi->s_fc_ineligible_tid)) {
sbi->s_fc_ineligible_tid = 0;
ext4_clear_mount_flag(sb, EXT4_MF_FC_INELIGIBLE);
}
* Re: [PATCH v3 1/2] ext4: fix fast commit inode enqueueing during a full journal commit
2024-05-29 9:20 ` [PATCH v3 1/2] " Luis Henriques (SUSE)
@ 2024-05-29 9:50 ` Jan Kara
2024-05-29 16:52 ` harshad shirwadkar
2024-07-09 3:59 ` Theodore Ts'o
1 sibling, 1 reply; 14+ messages in thread
From: Jan Kara @ 2024-05-29 9:50 UTC (permalink / raw)
To: Luis Henriques (SUSE)
Cc: Theodore Ts'o, Andreas Dilger, Jan Kara, Harshad Shirwadkar,
linux-ext4, linux-kernel
On Wed 29-05-24 10:20:29, Luis Henriques (SUSE) wrote:
> When a full journal commit is on-going, any fast commit has to be enqueued
> into a different queue: FC_Q_STAGING instead of FC_Q_MAIN. This enqueueing
> is done only once, i.e. if an inode is already queued in a previous fast
> commit entry it won't be enqueued again. However, if a full commit starts
> _after_ the inode is enqueued into FC_Q_MAIN, the next fast commit needs to
> be done into FC_Q_STAGING. And this is not being done in function
> ext4_fc_track_template().
>
> This patch fixes the issue by re-enqueuing an inode into the STAGING queue
> during the fast commit clean-up callback if it has a tid (i_sync_tid)
> greater than the one being handled. The STAGING queue will then be spliced
> back into MAIN.
>
> This bug was found using fstest generic/047. This test creates several
> 32k-byte files, sync'ing each of them after its creation, and then shuts
> down the filesystem. Some data may be lost in this operation; for example, a
> file may have its size truncated to zero.
>
> Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
Looks good to me! Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Just a typo correction below.
> diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
> index 87c009e0c59a..088bd509b116 100644
> --- a/fs/ext4/fast_commit.c
> +++ b/fs/ext4/fast_commit.c
> @@ -1282,8 +1282,17 @@ static void ext4_fc_cleanup(journal_t *journal, int full, tid_t tid)
> list_del_init(&iter->i_fc_list);
> ext4_clear_inode_state(&iter->vfs_inode,
> EXT4_STATE_FC_COMMITTING);
> - if (iter->i_sync_tid <= tid)
> + if (iter->i_sync_tid <= tid) {
> ext4_fc_reset_inode(&iter->vfs_inode);
> + } else {
> + /*
> + * re-enqueue inode into STAGING, which later will be
> + * splice back into MAIN
^^^ spliced
> + */
> + list_add_tail(&EXT4_I(&iter->vfs_inode)->i_fc_list,
> + &sbi->s_fc_q[FC_Q_STAGING]);
> + }
> +
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
* Re: [PATCH v3 2/2] ext4: fix possible tid_t sequence overflows
2024-05-29 9:20 ` [PATCH v3 2/2] ext4: fix possible tid_t sequence overflows Luis Henriques (SUSE)
@ 2024-05-29 9:51 ` Jan Kara
2024-05-29 16:51 ` harshad shirwadkar
0 siblings, 1 reply; 14+ messages in thread
From: Jan Kara @ 2024-05-29 9:51 UTC (permalink / raw)
To: Luis Henriques (SUSE)
Cc: Theodore Ts'o, Andreas Dilger, Jan Kara, Harshad Shirwadkar,
linux-ext4, linux-kernel
On Wed 29-05-24 10:20:30, Luis Henriques (SUSE) wrote:
> In the fast commit code there are a few places where tid_t variables are
> being compared without taking into account the fact that these sequence
> numbers may wrap. Fix this issue by using the helper functions tid_gt()
> and tid_geq().
>
> Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
Thanks! Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> fs/ext4/fast_commit.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
> index 088bd509b116..30d312e16916 100644
> --- a/fs/ext4/fast_commit.c
> +++ b/fs/ext4/fast_commit.c
> @@ -353,7 +353,7 @@ void ext4_fc_mark_ineligible(struct super_block *sb, int reason, handle_t *handl
> read_unlock(&sbi->s_journal->j_state_lock);
> }
> spin_lock(&sbi->s_fc_lock);
> - if (sbi->s_fc_ineligible_tid < tid)
> + if (tid_gt(tid, sbi->s_fc_ineligible_tid))
> sbi->s_fc_ineligible_tid = tid;
> spin_unlock(&sbi->s_fc_lock);
> WARN_ON(reason >= EXT4_FC_REASON_MAX);
> @@ -1207,7 +1207,7 @@ int ext4_fc_commit(journal_t *journal, tid_t commit_tid)
> if (ret == -EALREADY) {
> /* There was an ongoing commit, check if we need to restart */
> if (atomic_read(&sbi->s_fc_subtid) <= subtid &&
> - commit_tid > journal->j_commit_sequence)
> + tid_gt(commit_tid, journal->j_commit_sequence))
> goto restart_fc;
> ext4_fc_update_stats(sb, EXT4_FC_STATUS_SKIPPED, 0, 0,
> commit_tid);
> @@ -1282,7 +1282,7 @@ static void ext4_fc_cleanup(journal_t *journal, int full, tid_t tid)
> list_del_init(&iter->i_fc_list);
> ext4_clear_inode_state(&iter->vfs_inode,
> EXT4_STATE_FC_COMMITTING);
> - if (iter->i_sync_tid <= tid) {
> + if (tid_geq(tid, iter->i_sync_tid)) {
> ext4_fc_reset_inode(&iter->vfs_inode);
> } else {
> /*
> @@ -1322,7 +1322,7 @@ static void ext4_fc_cleanup(journal_t *journal, int full, tid_t tid)
> list_splice_init(&sbi->s_fc_q[FC_Q_STAGING],
> &sbi->s_fc_q[FC_Q_MAIN]);
>
> - if (tid >= sbi->s_fc_ineligible_tid) {
> + if (tid_geq(tid, sbi->s_fc_ineligible_tid)) {
> sbi->s_fc_ineligible_tid = 0;
> ext4_clear_mount_flag(sb, EXT4_MF_FC_INELIGIBLE);
> }
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
* Re: [PATCH v3 2/2] ext4: fix possible tid_t sequence overflows
2024-05-29 9:51 ` Jan Kara
@ 2024-05-29 16:51 ` harshad shirwadkar
0 siblings, 0 replies; 14+ messages in thread
From: harshad shirwadkar @ 2024-05-29 16:51 UTC (permalink / raw)
To: Jan Kara
Cc: Luis Henriques (SUSE), Theodore Ts'o, Andreas Dilger,
linux-ext4, linux-kernel
Looks good, thanks for the patch!
Reviewed-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
On Wed, May 29, 2024 at 2:51 AM Jan Kara <jack@suse.cz> wrote:
>
> On Wed 29-05-24 10:20:30, Luis Henriques (SUSE) wrote:
> > In the fast commit code there are a few places where tid_t variables are
> > being compared without taking into account the fact that these sequence
> > numbers may wrap. Fix this issue by using the helper functions tid_gt()
> > and tid_geq().
> >
> > Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
>
> Thanks! Feel free to add:
>
> Reviewed-by: Jan Kara <jack@suse.cz>
>
> Honza
>
> > ---
> > fs/ext4/fast_commit.c | 8 ++++----
> > 1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
> > index 088bd509b116..30d312e16916 100644
> > --- a/fs/ext4/fast_commit.c
> > +++ b/fs/ext4/fast_commit.c
> > @@ -353,7 +353,7 @@ void ext4_fc_mark_ineligible(struct super_block *sb, int reason, handle_t *handl
> > read_unlock(&sbi->s_journal->j_state_lock);
> > }
> > spin_lock(&sbi->s_fc_lock);
> > - if (sbi->s_fc_ineligible_tid < tid)
> > + if (tid_gt(tid, sbi->s_fc_ineligible_tid))
> > sbi->s_fc_ineligible_tid = tid;
> > spin_unlock(&sbi->s_fc_lock);
> > WARN_ON(reason >= EXT4_FC_REASON_MAX);
> > @@ -1207,7 +1207,7 @@ int ext4_fc_commit(journal_t *journal, tid_t commit_tid)
> > if (ret == -EALREADY) {
> > /* There was an ongoing commit, check if we need to restart */
> > if (atomic_read(&sbi->s_fc_subtid) <= subtid &&
> > - commit_tid > journal->j_commit_sequence)
> > + tid_gt(commit_tid, journal->j_commit_sequence))
> > goto restart_fc;
> > ext4_fc_update_stats(sb, EXT4_FC_STATUS_SKIPPED, 0, 0,
> > commit_tid);
> > @@ -1282,7 +1282,7 @@ static void ext4_fc_cleanup(journal_t *journal, int full, tid_t tid)
> > list_del_init(&iter->i_fc_list);
> > ext4_clear_inode_state(&iter->vfs_inode,
> > EXT4_STATE_FC_COMMITTING);
> > - if (iter->i_sync_tid <= tid) {
> > + if (tid_geq(tid, iter->i_sync_tid)) {
> > ext4_fc_reset_inode(&iter->vfs_inode);
> > } else {
> > /*
> > @@ -1322,7 +1322,7 @@ static void ext4_fc_cleanup(journal_t *journal, int full, tid_t tid)
> > list_splice_init(&sbi->s_fc_q[FC_Q_STAGING],
> > &sbi->s_fc_q[FC_Q_MAIN]);
> >
> > - if (tid >= sbi->s_fc_ineligible_tid) {
> > + if (tid_geq(tid, sbi->s_fc_ineligible_tid)) {
> > sbi->s_fc_ineligible_tid = 0;
> > ext4_clear_mount_flag(sb, EXT4_MF_FC_INELIGIBLE);
> > }
> >
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR
* Re: [PATCH v3 1/2] ext4: fix fast commit inode enqueueing during a full journal commit
2024-05-29 9:50 ` Jan Kara
@ 2024-05-29 16:52 ` harshad shirwadkar
0 siblings, 0 replies; 14+ messages in thread
From: harshad shirwadkar @ 2024-05-29 16:52 UTC (permalink / raw)
To: Jan Kara
Cc: Luis Henriques (SUSE), Theodore Ts'o, Andreas Dilger,
linux-ext4, linux-kernel
Looks good!
Reviewed-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
On Wed, May 29, 2024 at 2:50 AM Jan Kara <jack@suse.cz> wrote:
>
> On Wed 29-05-24 10:20:29, Luis Henriques (SUSE) wrote:
> > When a full journal commit is on-going, any fast commit has to be enqueued
> > into a different queue: FC_Q_STAGING instead of FC_Q_MAIN. This enqueueing
> > is done only once, i.e. if an inode is already queued in a previous fast
> > commit entry it won't be enqueued again. However, if a full commit starts
> > _after_ the inode is enqueued into FC_Q_MAIN, the next fast commit needs to
> > be done into FC_Q_STAGING. And this is not being done in function
> > ext4_fc_track_template().
> >
> > This patch fixes the issue by re-enqueuing an inode into the STAGING queue
> > during the fast commit clean-up callback if it has a tid (i_sync_tid)
> > greater than the one being handled. The STAGING queue will then be spliced
> > back into MAIN.
> >
> > This bug was found using fstest generic/047. This test creates several
> > 32k-byte files, sync'ing each of them after its creation, and then shuts
> > down the filesystem. Some data may be lost in this operation; for example, a
> > file may have its size truncated to zero.
> >
> > Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
>
> Looks good to me! Feel free to add:
>
> Reviewed-by: Jan Kara <jack@suse.cz>
>
> Just a typo correction below.
>
> > diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
> > index 87c009e0c59a..088bd509b116 100644
> > --- a/fs/ext4/fast_commit.c
> > +++ b/fs/ext4/fast_commit.c
> > @@ -1282,8 +1282,17 @@ static void ext4_fc_cleanup(journal_t *journal, int full, tid_t tid)
> > list_del_init(&iter->i_fc_list);
> > ext4_clear_inode_state(&iter->vfs_inode,
> > EXT4_STATE_FC_COMMITTING);
> > - if (iter->i_sync_tid <= tid)
> > + if (iter->i_sync_tid <= tid) {
> > ext4_fc_reset_inode(&iter->vfs_inode);
> > + } else {
> > + /*
> > + * re-enqueue inode into STAGING, which later will be
> > + * splice back into MAIN
> ^^^ spliced
>
> > + */
> > + list_add_tail(&EXT4_I(&iter->vfs_inode)->i_fc_list,
> > + &sbi->s_fc_q[FC_Q_STAGING]);
> > + }
> > +
>
> Honza
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR
* Re: [PATCH v3 0/2] ext4: fix fast commit inode enqueueing during a full journal commit
2024-05-29 9:20 [PATCH v3 0/2] ext4: fix fast commit inode enqueueing during a full journal commit Luis Henriques (SUSE)
2024-05-29 9:20 ` [PATCH v3 1/2] " Luis Henriques (SUSE)
2024-05-29 9:20 ` [PATCH v3 2/2] ext4: fix possible tid_t sequence overflows Luis Henriques (SUSE)
@ 2024-06-27 13:54 ` Luis Henriques
2024-06-27 14:58 ` Theodore Ts'o
2024-07-11 2:35 ` Theodore Ts'o
3 siblings, 1 reply; 14+ messages in thread
From: Luis Henriques @ 2024-06-27 13:54 UTC (permalink / raw)
To: Theodore Ts'o, Andreas Dilger
Cc: Jan Kara, Harshad Shirwadkar, linux-ext4, linux-kernel
On Wed, May 29 2024, Luis Henriques (SUSE) wrote:
> Hi!
>
> Here's v3 of this fix to the fast commit enqueuing bug triggered by fstest
> generic/047. This version simplifies the previous patch version by re-using
> the i_sync_tid field in struct ext4_inode_info instead of adding a new one.
>
> The extra patch includes a few extra fixes to the tid_t type handling. Jan
> brought to my attention the fact that this sequence number may wrap, and I
> quickly found a few places in the code where the tid_geq() and tid_gt()
> helpers had to be used.
>
> Again, please note that this fix requires [1] to be applied too.
>
> [1] https://lore.kernel.org/all/20240515082857.32730-1-luis.henriques@linux.dev
>
> Luis Henriques (SUSE) (2):
> ext4: fix fast commit inode enqueueing during a full journal commit
> ext4: fix possible tid_t sequence overflows
Gentle ping... Has this fallen through the cracks?
Cheers,
--
Luis
>
> fs/ext4/fast_commit.c | 17 +++++++++++++----
> 1 file changed, 13 insertions(+), 4 deletions(-)
>
* Re: [PATCH v3 0/2] ext4: fix fast commit inode enqueueing during a full journal commit
2024-06-27 13:54 ` [PATCH v3 0/2] ext4: fix fast commit inode enqueueing during a full journal commit Luis Henriques
@ 2024-06-27 14:58 ` Theodore Ts'o
2024-06-27 15:10 ` Luis Henriques
0 siblings, 1 reply; 14+ messages in thread
From: Theodore Ts'o @ 2024-06-27 14:58 UTC (permalink / raw)
To: Luis Henriques
Cc: Andreas Dilger, Jan Kara, Harshad Shirwadkar, linux-ext4,
linux-kernel
On Thu, Jun 27, 2024 at 02:54:39PM +0100, Luis Henriques wrote:
> On Wed, May 29 2024, Luis Henriques (SUSE) wrote:
>
> > Hi!
> >
> > Here's v3 of this fix to the fast commit enqueuing bug triggered by fstest
> > generic/047. This version simplifies the previous patch version by re-using
> > the i_sync_tid field in struct ext4_inode_info instead of adding a new one.
> >
> > The extra patch includes a few extra fixes to the tid_t type handling. Jan
> > brought to my attention the fact that this sequence number may wrap, and I
> > quickly found a few places in the code where the tid_geq() and tid_gt()
> > helpers had to be used.
> >
> > Again, please note that this fix requires [1] to be applied too.
> >
> > [1] https://lore.kernel.org/all/20240515082857.32730-1-luis.henriques@linux.dev
> >
> > Luis Henriques (SUSE) (2):
> > ext4: fix fast commit inode enqueueing during a full journal commit
> > ext4: fix possible tid_t sequence overflows
>
> Gentle ping... Has this fallen through the cracks?
Sorry, I'm still catching up after being on vacation. There is a
batch of commits which I've reviewed (up to May 17th) which is
currently undergoing testing. So that doesn't include this patch yet,
but it's on the list of patches to be reviewed at
patchworks.ozlabs.org/project/linux-ext4 so it won't fall through the
cracks.
- Ted
* Re: [PATCH v3 0/2] ext4: fix fast commit inode enqueueing during a full journal commit
2024-06-27 14:58 ` Theodore Ts'o
@ 2024-06-27 15:10 ` Luis Henriques
0 siblings, 0 replies; 14+ messages in thread
From: Luis Henriques @ 2024-06-27 15:10 UTC (permalink / raw)
To: Theodore Ts'o
Cc: Luis Henriques, Andreas Dilger, Jan Kara, Harshad Shirwadkar,
linux-ext4, linux-kernel
On Thu, Jun 27 2024, Theodore Ts'o wrote:
> On Thu, Jun 27, 2024 at 02:54:39PM +0100, Luis Henriques wrote:
>> On Wed, May 29 2024, Luis Henriques (SUSE) wrote:
>>
>> > Hi!
>> >
>> > Here's v3 of this fix to the fast commit enqueuing bug triggered by fstest
>> > generic/047. This version simplifies the previous patch version by re-using
>> > the i_sync_tid field in struct ext4_inode_info instead of adding a new one.
>> >
>> > The extra patch includes a few extra fixes to the tid_t type handling. Jan
>> > brought to my attention the fact that this sequence number may wrap, and I
>> > quickly found a few places in the code where the tid_geq() and tid_gt()
>> > helpers had to be used.
>> >
>> > Again, please note that this fix requires [1] to be applied too.
>> >
>> > [1] https://lore.kernel.org/all/20240515082857.32730-1-luis.henriques@linux.dev
>> >
>> > Luis Henriques (SUSE) (2):
>> > ext4: fix fast commit inode enqueueing during a full journal commit
>> > ext4: fix possible tid_t sequence overflows
>>
>> Gentle ping... Has this fallen through the cracks?
>
> Sorry, I'm still catching up after being on vacation. There is a
> batch of commits which I've reviewed (up to May 17th) which is
> currently undergoing testing. So that doesn't include this patch yet,
> but it's on the list of patches to be reviewed at
> patchworks.ozlabs.org/project/linux-ext4 so it won't fall through the
> cracks.
Awesome, thanks for the update. And sorry for being impatient.
/me goes back under his rock.
Cheers,
--
Luís
* Re: [PATCH v3 1/2] ext4: fix fast commit inode enqueueing during a full journal commit
2024-05-29 9:20 ` [PATCH v3 1/2] " Luis Henriques (SUSE)
2024-05-29 9:50 ` Jan Kara
@ 2024-07-09 3:59 ` Theodore Ts'o
2024-07-09 14:39 ` Luis Henriques
1 sibling, 1 reply; 14+ messages in thread
From: Theodore Ts'o @ 2024-07-09 3:59 UTC (permalink / raw)
To: Luis Henriques (SUSE)
Cc: Andreas Dilger, Jan Kara, Harshad Shirwadkar, linux-ext4,
linux-kernel
On Wed, May 29, 2024 at 10:20:29AM +0100, Luis Henriques (SUSE) wrote:
> When a full journal commit is on-going, any fast commit has to be enqueued
> into a different queue: FC_Q_STAGING instead of FC_Q_MAIN. This enqueueing
> is done only once, i.e. if an inode is already queued in a previous fast
> commit entry it won't be enqueued again. However, if a full commit starts
> _after_ the inode is enqueued into FC_Q_MAIN, the next fast commit needs to
> be done into FC_Q_STAGING. And this is not being done in function
> ext4_fc_track_template().
>
> This patch fixes the issue by re-enqueuing an inode into the STAGING queue
> during the fast commit clean-up callback if it has a tid (i_sync_tid)
> greater than the one being handled. The STAGING queue will then be spliced
> back into MAIN.
>
> This bug was found using fstest generic/047. This test creates several
> 32k-byte files, sync'ing each of them after its creation, and then shuts
> down the filesystem. Some data may be lost in this operation; for example, a
> file may have its size truncated to zero.
>
> Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
This patch is causing a regression in tests generic/472, generic/496
and generic/643 if fast_commit is enabled, i.e. when using the
ext4/adv or ext4/fast_commit configuration, e.g.:
% kvm-xfstests -c ext4/fast_commit generic/472 generic/496 generic/643
For all of these tests, the failures seem to involve the swapon command
erroring out:
--- tests/generic/496.out 2024-06-13 18:57:39.000000000 -0400
+++ /results/ext4/results-fast_commit/generic/496.out.bad 2024-07-08 23:46:39.720
@@ -1,3 +1,4 @@
QA output created by 496
fallocate swap
mixed swap
+swapon: Invalid argument
...
but it's unclear why this patch would affect swapon.
I've never been able to see the generic/047 failure in any of my ext4/dev
testing, nor in any of my daily fs-next CI testing. So for that
reason, I'm going to drop this patch from my tree.
The second patch in this series appears to be independent at least
from a logical perspective --- although a minor change is needed to
resolve a merge conflict after dropping this change.
Luis, Harshad, could you look into this failure and then resubmit once
it's been fixed? Thanks!! Also, Luis, can you give more details
about the generic/047 failure that you had seen? Is it one of those
flaky tests where you need to run the test dozens or hundreds of times
to see the failure?
Many thanks!!
- Ted
* Re: [PATCH v3 1/2] ext4: fix fast commit inode enqueueing during a full journal commit
2024-07-09 3:59 ` Theodore Ts'o
@ 2024-07-09 14:39 ` Luis Henriques
2024-07-10 10:32 ` Luis Henriques
0 siblings, 1 reply; 14+ messages in thread
From: Luis Henriques @ 2024-07-09 14:39 UTC (permalink / raw)
To: Theodore Ts'o
Cc: Andreas Dilger, Jan Kara, Harshad Shirwadkar, linux-ext4,
linux-kernel
On Mon, Jul 08 2024, Theodore Ts'o wrote:
> On Wed, May 29, 2024 at 10:20:29AM +0100, Luis Henriques (SUSE) wrote:
>> When a full journal commit is on-going, any fast commit has to be enqueued
>> into a different queue: FC_Q_STAGING instead of FC_Q_MAIN. This enqueueing
>> is done only once, i.e. if an inode is already queued in a previous fast
>> commit entry it won't be enqueued again. However, if a full commit starts
>> _after_ the inode is enqueued into FC_Q_MAIN, the next fast commit needs to
>> be done into FC_Q_STAGING. And this is not being done in function
>> ext4_fc_track_template().
>>
>> This patch fixes the issue by re-enqueuing an inode into the STAGING queue
>> during the fast commit clean-up callback if it has a tid (i_sync_tid)
>> greater than the one being handled. The STAGING queue will then be spliced
>> back into MAIN.
>>
>> This bug was found using fstest generic/047. This test creates several
>> 32k-byte files, sync'ing each of them after its creation, and then shuts
>> down the filesystem. Some data may be lost in this operation; for example, a
>> file may have its size truncated to zero.
>>
>> Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
>
> This patch is causing a regression in tests generic/472, generic/496
> and generic/643 if fast_commit is enabled, i.e. when using the
> ext4/adv or ext4/fast_commit configuration, e.g.:
>
> % kvm-xfstests -c ext4/fast_commit generic/472 generic/496 generic/643
>
> For all of these tests, the failures seem to involve the swapon command
> erroring out:
>
> --- tests/generic/496.out 2024-06-13 18:57:39.000000000 -0400
> +++ /results/ext4/results-fast_commit/generic/496.out.bad 2024-07-08 23:46:39.720
> @@ -1,3 +1,4 @@
> QA output created by 496
> fallocate swap
> mixed swap
> +swapon: Invalid argument
> ...
>
> but it's unclear why this patch would affect swapon.
OK, that's... embarrassing. I should have caught these failures :-(
> I've never been able to see the generic/047 failure in any of my ext4/dev
> testing, nor in any of my daily fs-next CI testing. So for that
> reason, I'm going to drop this patch from my tree.
There's nothing special about my test environment. I can reproduce the
generic/047 failure (although not 100% of the time) by running it
manually in a virtme-ng test environment, using MKFS_OPTIONS="-O fast_commit".
Here's what I see when running it:
FSTYP -- ext4
PLATFORM -- Linux/x86_64 virtme-ng 6.10.0-rc7+ #269 SMP PREEMPT_DYNAMIC Tue Jul 9 14:24:22 WEST 2024
MKFS_OPTIONS -- -F -O fast_commit /dev/vdb1
MOUNT_OPTIONS -- -o acl,user_xattr /dev/vdb1 /tmp/mnt/scratch
generic/047 162s ... - output mismatch (see [...]/testing/xfstests-dev/results//generic/047.out.bad)
--- tests/generic/047.out 2021-01-11 12:08:14.972458324 +0000
+++ [...]/testing/xfstests-dev/results//generic/047.out.bad 2024-07-09 14:28:36.626435948 +0100
@@ -1 +1,2 @@
QA output created by 047
+file /tmp/mnt/scratch/944 has incorrect size - fsync failed
...
(Run 'diff -u [...]/testing/xfstests-dev/tests/generic/047.out [...]/testing/xfstests-dev/results//generic/047.out.bad' to see the entire diff)
Ran: generic/047
Failures: generic/047
Failed 1 of 1 tests
> The second patch in this series appears to be independent at least
> from a logical perspective --- although a minor change is needed to
> resolve a merge conflict after dropping this change.
>
> Luis, Harshad, could you look into this failure and then resubmit once
> it's been fixed? Thanks!! Also, Luis, can you give more details
> about the generic/047 failure that you had seen? Is it one of those
> flaky tests where you need to run the test dozens or hundreds of times
> to see the failure?
So, I've done some quick tests, but I'll need some more time to dig into
it. And this is what I _think_ is happening:
When activating a swap file, the kernel forces an fsync, calling
ext4_sync_file(), which will then call ext4_fc_commit() and, eventually,
ext4_fc_cleanup().
With this patch an inode may be re-enqueued into the STAGING queue and
then spliced back into MAIN; and that's exactly what I see happening.
Later, still on the swap activation path, ext4_set_iomap() will be called
and will do this:
if (ext4_inode_datasync_dirty(inode) ||
offset + length > i_size_read(inode))
iomap->flags |= IOMAP_F_DIRTY;
'ext4_inode_datasync_dirty()' will return true because '->i_fc_list' is not
empty. And that's why swapon will fail.
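To make the chain above easier to follow, here is a minimal userspace model of it. This is a sketch, not kernel code: struct swap_model_inode, MODEL_IOMAP_F_DIRTY and the model_* functions are hypothetical names. It also assumes that swap activation refuses any extent flagged dirty and returns -EINVAL, which would match the "swapon: Invalid argument" output but has not been verified here against the swap activation code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Stand-in for IOMAP_F_DIRTY; the value is arbitrary in this model. */
#define MODEL_IOMAP_F_DIRTY 0x1u

/* Hypothetical, heavily reduced inode for this model only. */
struct swap_model_inode {
	bool on_fc_list;	/* models a non-empty ->i_fc_list */
	long long i_size;	/* models i_size_read(inode) */
};

/* Mirrors the condition quoted from ext4_set_iomap() above. */
static unsigned int model_set_iomap_flags(const struct swap_model_inode *inode,
					  long long offset, long long length)
{
	unsigned int flags = 0;

	if (inode->on_fc_list || offset + length > inode->i_size)
		flags |= MODEL_IOMAP_F_DIRTY;
	return flags;
}

/* Assumption: a dirty extent makes swap activation fail with -EINVAL. */
static int model_swap_activate(const struct swap_model_inode *inode)
{
	unsigned int flags = model_set_iomap_flags(inode, 0, inode->i_size);

	return (flags & MODEL_IOMAP_F_DIRTY) ? -EINVAL : 0;
}
```

In this model an inode that is still sitting on a fast commit queue after the fsync is reported as dirty, and the activation is rejected even though the file size and the mapped range agree.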
Anyway, I'll try to figure out what's missing here (or what's wrong with
my patch).
Cheers,
--
Luís
* Re: [PATCH v3 1/2] ext4: fix fast commit inode enqueueing during a full journal commit
2024-07-09 14:39 ` Luis Henriques
@ 2024-07-10 10:32 ` Luis Henriques
0 siblings, 0 replies; 14+ messages in thread
From: Luis Henriques @ 2024-07-10 10:32 UTC (permalink / raw)
To: Theodore Ts'o
Cc: Andreas Dilger, Jan Kara, Harshad Shirwadkar, linux-ext4,
linux-kernel
On Tue, Jul 09 2024, Luis Henriques wrote:
> On Mon, Jul 08 2024, Theodore Ts'o wrote:
>
>> On Wed, May 29, 2024 at 10:20:29AM +0100, Luis Henriques (SUSE) wrote:
>>> When a full journal commit is on-going, any fast commit has to be enqueued
>>> into a different queue: FC_Q_STAGING instead of FC_Q_MAIN. This enqueueing
>>> is done only once, i.e. if an inode is already queued in a previous fast
>>> commit entry it won't be enqueued again. However, if a full commit starts
>>> _after_ the inode is enqueued into FC_Q_MAIN, the next fast commit needs to
>>> be done into FC_Q_STAGING. And this is not being done in function
>>> ext4_fc_track_template().
>>>
>>> This patch fixes the issue by re-enqueuing an inode into the STAGING queue
>>> during the fast commit clean-up callback if it has a tid (i_sync_tid)
>>> greater than the one being handled. The STAGING queue will then be spliced
>>> back into MAIN.
>>>
>>> This bug was found using fstest generic/047. This test creates several 32k
>>> bytes files, sync'ing each of them after its creation, and then shutting
>>> down the filesystem. Some data may be lost in this operation; for example a
>>> file may have its size truncated to zero.
>>>
>>> Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
>>
>> This patch is causing a regression for the tests generic/472,
>> generic/496 and generic/643 if fast_commit is enabled. So using the
>> ext4/adv or ext4/fast_commit configuration, e.g:
>>
>> % kvm-xfstests -c ext4/fast_commit generic/472 generic/496 generic/643
>>
>> For all of these tests, the failures seem to involve the swapon command
>> erroring out:
>>
>> --- tests/generic/496.out 2024-06-13 18:57:39.000000000 -0400
>> +++ /results/ext4/results-fast_commit/generic/496.out.bad 2024-07-08 23:46:39.720
>> @@ -1,3 +1,4 @@
>> QA output created by 496
>> fallocate swap
>> mixed swap
>> +swapon: Invalid argument
>> ...
>>
>> but it's unclear why this patch would affect swapon.
>
> OK, that's... embarrassing. I should have caught these failures :-(
>
>> I've never been able to see generic/047 failure in any of my ext4/dev
>> testing, nor in any of my daily fs-next CI testing. So for that
>> reason, I'm going to drop this patch from my tree.
>
> There's nothing special about my test environment. I can reproduce the
> generic/047 failure (although not 100% of the time) by running it
> manually in a virtme-ng test environment, using MKFS_OPTIONS="-O fast_commit".
> Here's what I see when running it:
>
> FSTYP -- ext4
> PLATFORM -- Linux/x86_64 virtme-ng 6.10.0-rc7+ #269 SMP PREEMPT_DYNAMIC Tue Jul 9 14:24:22 WEST 2024
> MKFS_OPTIONS -- -F -O fast_commit /dev/vdb1
> MOUNT_OPTIONS -- -o acl,user_xattr /dev/vdb1 /tmp/mnt/scratch
>
> generic/047 162s ... - output mismatch (see [...]/testing/xfstests-dev/results//generic/047.out.bad)
> --- tests/generic/047.out 2021-01-11 12:08:14.972458324 +0000
> +++ [...]/testing/xfstests-dev/results//generic/047.out.bad 2024-07-09 14:28:36.626435948 +0100
> @@ -1 +1,2 @@
> QA output created by 047
> +file /tmp/mnt/scratch/944 has incorrect size - fsync failed
> ...
> (Run 'diff -u [...]/testing/xfstests-dev/tests/generic/047.out [...]/testing/xfstests-dev/results//generic/047.out.bad' to see the entire diff)
> Ran: generic/047
> Failures: generic/047
> Failed 1 of 1 tests
>
>> The second patch in this series appears to be independent at least
>> from a logical perspective --- although a minor change is needed to
>> resolve a merge conflict after dropping this change.
>>
>> Luis, Harshad, could you look into this failure and then resubmit once
>> it's been fixed? Thanks!! Also, Luis, can you give more details
>> about the generic/047 failure that you had seen? Is it one of those
>> flaky tests where you need to run the test dozens or hundreds of times
>> to see the failure?
>
>
> So, I've done some quick tests, but I'll need some more time to dig into
> it. And this is what I _think_ is happening:
>
> When activating a swap file, the kernel forces an fsync, calling
> ext4_sync_file() which will then call ext4_fc_commit() and, eventually,
> the ext4_fc_cleanup().
>
> With this patch an inode may be re-enqueued into the STAGING queue and
> then spliced back into MAIN; and that's exactly what I see happening.
>
> Later, still on the swap activation path, ext4_set_iomap() will be called
> and will do this:
>
> if (ext4_inode_datasync_dirty(inode) ||
> offset + length > i_size_read(inode))
> iomap->flags |= IOMAP_F_DIRTY;
>
> 'ext4_inode_datasync_dirty()' will be true because '->i_fc_list' is not
> empty. And that's why the swapon will fail.
>
> Anyway, I'll try to figure out what's missing here (or what's wrong with
> my patch).
I believe I found the issue with the patch. The ext4_fc_cleanup()
callback can be invoked in three different situations:
1) when there's a full commit
2) when there's a fc commit but with fallback to full commit
3) when there's a fc commit
For both 1) and 2) the cleanup callback will get a real 'tid' value;
however, for the regular fc commit 3) the 'tid' will be 0. And in that
case the inode should not be re-enqueued into STAGING. See
below an updated diff with this fix.
Does this make sense? I'll run a few more tests before sending a v4.
Cheers
--
Luís
diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
index 87c009e0c59a..86d33741452a 100644
--- a/fs/ext4/fast_commit.c
+++ b/fs/ext4/fast_commit.c
@@ -1282,8 +1282,17 @@ static void ext4_fc_cleanup(journal_t *journal, int full, tid_t tid)
list_del_init(&iter->i_fc_list);
ext4_clear_inode_state(&iter->vfs_inode,
EXT4_STATE_FC_COMMITTING);
- if (iter->i_sync_tid <= tid)
+ if (iter->i_sync_tid <= tid) {
ext4_fc_reset_inode(&iter->vfs_inode);
+ } else if (tid) {
+ /*
+ * re-enqueue inode into STAGING, which will later be
+ * spliced back into MAIN
+ */
+ list_add_tail(&EXT4_I(&iter->vfs_inode)->i_fc_list,
+ &sbi->s_fc_q[FC_Q_STAGING]);
+ }
+
/* Make sure EXT4_STATE_FC_COMMITTING bit is clear */
smp_mb();
#if (BITS_PER_LONG < 64)
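The branch logic in the hunk above can be modeled in userspace as follows. This is only a sketch: struct model_inode, enum model_queue and model_fc_cleanup_one() are hypothetical names, reset_done stands in for the call to ext4_fc_reset_inode(), and the plain '<=' comparison is copied from the diff as-is.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int tid_t;

/* Which fast commit queue the model inode currently sits on. */
enum model_queue { Q_NONE, Q_MAIN, Q_STAGING };

/* Hypothetical, heavily reduced stand-in for struct ext4_inode_info. */
struct model_inode {
	tid_t i_sync_tid;
	enum model_queue queue;
	int reset_done;  /* set where the diff calls ext4_fc_reset_inode() */
};

/*
 * Models the per-inode step of the cleanup callback after the change:
 * tid is non-zero for a full commit (or a fallback to one) and 0 for a
 * regular fast commit.
 */
static void model_fc_cleanup_one(struct model_inode *ino, tid_t tid)
{
	assert(ino != NULL);

	ino->queue = Q_NONE;            /* models list_del_init() */
	if (ino->i_sync_tid <= tid)
		ino->reset_done = 1;    /* inode fully handled by this commit */
	else if (tid)
		ino->queue = Q_STAGING; /* newer changes: keep it queued */
}
```

With tid == 0 (case 3) an inode whose i_sync_tid is non-zero is neither reset nor re-enqueued, so it leaves the queue; with a real tid (cases 1 and 2) an inode carrying a newer i_sync_tid goes to STAGING.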
* Re: [PATCH v3 0/2] ext4: fix fast commit inode enqueueing during a full journal commit
2024-05-29 9:20 [PATCH v3 0/2] ext4: fix fast commit inode enqueueing during a full journal commit Luis Henriques (SUSE)
` (2 preceding siblings ...)
2024-06-27 13:54 ` [PATCH v3 0/2] ext4: fix fast commit inode enqueueing during a full journal commit Luis Henriques
@ 2024-07-11 2:35 ` Theodore Ts'o
3 siblings, 0 replies; 14+ messages in thread
From: Theodore Ts'o @ 2024-07-11 2:35 UTC (permalink / raw)
To: Andreas Dilger, Jan Kara, Harshad Shirwadkar,
Luis Henriques (SUSE)
Cc: Theodore Ts'o, linux-ext4, linux-kernel
On Wed, 29 May 2024 10:20:28 +0100, Luis Henriques (SUSE) wrote:
> Here's v3 of this fix to the fast commit enqueuing bug triggered by fstest
> generic/047. This version simplifies the previous patch version by re-using
> the i_sync_tid field in struct ext4_inode_info instead of adding a new one.
>
> The extra patch includes a few extra fixes to the tid_t type handling. Jan
> brought to my attention the fact that this sequence number may wrap, and I
> quickly found a few places in the code where the tid_geq() and tid_gt()
> helpers had to be used.
>
> [...]
The second patch in this series was applied:
[2/2] ext4: fix possible tid_t sequence overflows
commit: 63469662cc45d41705f14b4648481d5d29cf5999
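For context on the wrap problem that patch addresses: tid_t is an unsigned sequence number, so a plain '>' gives the wrong answer once the counter wraps. Below is a minimal userspace sketch of a signed-difference comparison, modeled on the tid_gt()/tid_geq() helpers mentioned above. The model_* names are hypothetical; the real helpers live in the kernel's jbd2 header.

```c
#include <assert.h>

typedef unsigned int tid_t;

/*
 * Wrap-safe "x is after y": take the unsigned difference and interpret
 * it as signed. The conversion to int relies on the usual two's
 * complement behaviour of common compilers.
 */
static int model_tid_gt(tid_t x, tid_t y)
{
	int difference = (int)(x - y);

	return difference > 0;
}

/* Wrap-safe "x is after or equal to y". */
static int model_tid_geq(tid_t x, tid_t y)
{
	int difference = (int)(x - y);

	return difference >= 0;
}

/* Shows the wrap case: 1 comes "after" 0xfffffff0 once the counter wraps. */
static void model_tid_wrap_demo(void)
{
	tid_t before_wrap = 0xfffffff0u;
	tid_t after_wrap = 1u;

	assert(!(after_wrap > before_wrap));            /* plain compare is wrong */
	assert(model_tid_gt(after_wrap, before_wrap));  /* signed difference is right */
}
```

The comparison is only meaningful while the two values are less than half the counter range apart.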
Best regards,
--
Theodore Ts'o <tytso@mit.edu>
Thread overview: 14+ messages
2024-05-29 9:20 [PATCH v3 0/2] ext4: fix fast commit inode enqueueing during a full journal commit Luis Henriques (SUSE)
2024-05-29 9:20 ` [PATCH v3 1/2] " Luis Henriques (SUSE)
2024-05-29 9:50 ` Jan Kara
2024-05-29 16:52 ` harshad shirwadkar
2024-07-09 3:59 ` Theodore Ts'o
2024-07-09 14:39 ` Luis Henriques
2024-07-10 10:32 ` Luis Henriques
2024-05-29 9:20 ` [PATCH v3 2/2] ext4: fix possible tid_t sequence overflows Luis Henriques (SUSE)
2024-05-29 9:51 ` Jan Kara
2024-05-29 16:51 ` harshad shirwadkar
2024-06-27 13:54 ` [PATCH v3 0/2] ext4: fix fast commit inode enqueueing during a full journal commit Luis Henriques
2024-06-27 14:58 ` Theodore Ts'o
2024-06-27 15:10 ` Luis Henriques
2024-07-11 2:35 ` Theodore Ts'o