* [PATCH 0/4] ext4: fix incorrect tid assumptions
From: Luis Henriques (SUSE) @ 2024-07-23 15:43 UTC
To: Theodore Ts'o, Andreas Dilger, Jan Kara, Harshad Shirwadkar
Cc: linux-ext4, linux-kernel, Luis Henriques (SUSE)
As discussed here [1], there are a few places in ext4 and jbd2 code where a
tid of '0' is assumed to be invalid, which isn't true.
This small patchset tries to fix (hopefully!) all these places. Jan Kara
had already identified the functions that needed to be fixed. I believe
that the only other issue is the handling of sbi->s_fc_ineligible_tid.
Each patch in this series fixes a single function; the last one also fixes
the sbi->s_fc_ineligible_tid handling.
[1] https://lore.kernel.org/all/20240716095201.o7kkrhfdy2bps3rw@quack3/
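To illustrate why a tid of '0' has to be treated as a valid value, here is a minimal user-space sketch. It assumes that tid_t behaves like an unsigned 32-bit counter and that tid_gt() compares tids through a signed difference, as the jbd2 helper of that name is understood to do; next_tid(), should_wait_buggy() and check_wraparound() are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: stand-in for the kernel's tid_t (unsigned 32-bit counter). */
typedef uint32_t tid_t;

/* Wraparound-aware "x is newer than y", modelled on jbd2's tid_gt(). */
static inline bool tid_gt(tid_t x, tid_t y)
{
	return (int32_t)(x - y) > 0;
}

/* Hypothetical helper: the tid that follows 'tid'; wraps to 0 at the top. */
static inline tid_t next_tid(tid_t tid)
{
	return tid + 1;
}

/* The idiom this series removes: tid 0 is read as "no transaction". */
static inline bool should_wait_buggy(tid_t tid)
{
	return tid != 0;
}

/* Small self-check showing that 0 is just another point on the circle. */
static inline void check_wraparound(void)
{
	tid_t last = UINT32_MAX;
	tid_t wrapped = next_tid(last);

	assert(wrapped == 0);			/* the counter really reaches 0 */
	assert(tid_gt(wrapped, last));		/* and 0 is newer than UINT32_MAX */
	assert(!should_wait_buggy(wrapped));	/* yet the old test would skip it */
}
```

The point of the sketch is only that a separate "is there a transaction" indication is needed, because no tid value is free to serve as a sentinel.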
Luis Henriques (SUSE) (4):
ext4: fix incorrect tid assumption in ext4_wait_for_tail_page_commit()
ext4: fix incorrect tid assumption in __jbd2_log_wait_for_space()
ext4: fix incorrect tid assumption in
jbd2_journal_shrink_checkpoint_list()
ext4: fix incorrect tid assumption in ext4_fc_mark_ineligible()
fs/ext4/fast_commit.c | 15 +++++++++++----
fs/ext4/inode.c | 10 ++++++----
fs/jbd2/checkpoint.c | 15 +++++++++++----
3 files changed, 28 insertions(+), 12 deletions(-)
* [PATCH 1/4] ext4: fix incorrect tid assumption in ext4_wait_for_tail_page_commit()
From: Luis Henriques (SUSE) @ 2024-07-23 15:43 UTC
To: Theodore Ts'o, Andreas Dilger, Jan Kara, Harshad Shirwadkar
Cc: linux-ext4, linux-kernel, Luis Henriques (SUSE)
Function ext4_wait_for_tail_page_commit() assumes that '0' is not a valid
value for transaction IDs, which is incorrect. Don't assume that and invoke
jbd2_log_wait_commit() if the journal had a committing transaction instead.
Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
---
fs/ext4/inode.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 941c1c0d5c6e..e65fc2086701 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -5279,8 +5279,9 @@ static void ext4_wait_for_tail_page_commit(struct inode *inode)
 {
 	unsigned offset;
 	journal_t *journal = EXT4_SB(inode->i_sb)->s_journal;
-	tid_t commit_tid = 0;
+	tid_t commit_tid;
 	int ret;
+	bool has_transaction = false;

 	offset = inode->i_size & (PAGE_SIZE - 1);
 	/*
@@ -5305,12 +5306,13 @@ static void ext4_wait_for_tail_page_commit(struct inode *inode)
 		folio_put(folio);
 		if (ret != -EBUSY)
 			return;
-		commit_tid = 0;
 		read_lock(&journal->j_state_lock);
-		if (journal->j_committing_transaction)
+		if (journal->j_committing_transaction) {
 			commit_tid = journal->j_committing_transaction->t_tid;
+			has_transaction = true;
+		}
 		read_unlock(&journal->j_state_lock);
-		if (commit_tid)
+		if (has_transaction)
 			jbd2_log_wait_commit(journal, commit_tid);
 	}
 }
* [PATCH 2/4] ext4: fix incorrect tid assumption in __jbd2_log_wait_for_space()
From: Luis Henriques (SUSE) @ 2024-07-23 15:44 UTC
To: Theodore Ts'o, Andreas Dilger, Jan Kara, Harshad Shirwadkar
Cc: linux-ext4, linux-kernel, Luis Henriques (SUSE)
Function __jbd2_log_wait_for_space() assumes that '0' is not a valid value
for transaction IDs, which is incorrect. Don't assume that and invoke
jbd2_log_wait_commit() if the journal had a committing transaction instead.
Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
---
fs/jbd2/checkpoint.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
index 951f78634adf..77bc522e6821 100644
--- a/fs/jbd2/checkpoint.c
+++ b/fs/jbd2/checkpoint.c
@@ -79,9 +79,12 @@ __releases(&journal->j_state_lock)
 		if (space_left < nblocks) {
 			int chkpt = journal->j_checkpoint_transactions != NULL;
 			tid_t tid = 0;
+			bool has_transaction = false;

-			if (journal->j_committing_transaction)
+			if (journal->j_committing_transaction) {
 				tid = journal->j_committing_transaction->t_tid;
+				has_transaction = true;
+			}
 			spin_unlock(&journal->j_list_lock);
 			write_unlock(&journal->j_state_lock);
 			if (chkpt) {
@@ -89,7 +92,7 @@ __releases(&journal->j_state_lock)
 			} else if (jbd2_cleanup_journal_tail(journal) == 0) {
 				/* We were able to recover space; yay! */
 				;
-			} else if (tid) {
+			} else if (has_transaction) {
 				/*
 				 * jbd2_journal_commit_transaction() may want
 				 * to take the checkpoint_mutex if JBD2_FLUSHED
* [PATCH 3/4] ext4: fix incorrect tid assumption in jbd2_journal_shrink_checkpoint_list()
From: Luis Henriques (SUSE) @ 2024-07-23 15:44 UTC
To: Theodore Ts'o, Andreas Dilger, Jan Kara, Harshad Shirwadkar
Cc: linux-ext4, linux-kernel, Luis Henriques (SUSE)
Function jbd2_journal_shrink_checkpoint_list() assumes that '0' is not a
valid value for transaction IDs, which is incorrect. Don't assume that and
use two extra boolean variables to control the loop iterations and keep
track of the first and last tid.
Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
---
fs/jbd2/checkpoint.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
index 77bc522e6821..f5a594237b7a 100644
--- a/fs/jbd2/checkpoint.c
+++ b/fs/jbd2/checkpoint.c
@@ -410,6 +410,7 @@ unsigned long jbd2_journal_shrink_checkpoint_list(journal_t *journal,
 	tid_t tid = 0;
 	unsigned long nr_freed = 0;
 	unsigned long freed;
+	bool is_first = true, is_last = false;

 again:
 	spin_lock(&journal->j_list_lock);
@@ -429,8 +430,10 @@ unsigned long jbd2_journal_shrink_checkpoint_list(journal_t *journal,
 	else
 		transaction = journal->j_checkpoint_transactions;

-	if (!first_tid)
+	if (is_first) {
 		first_tid = transaction->t_tid;
+		is_first = false;
+	}
 	last_transaction = journal->j_checkpoint_transactions->t_cpprev;
 	next_transaction = transaction;
 	last_tid = last_transaction->t_tid;
@@ -455,12 +458,13 @@ unsigned long jbd2_journal_shrink_checkpoint_list(journal_t *journal,
 	} else {
 		journal->j_shrink_transaction = NULL;
 		next_tid = 0;
+		is_last = true;
 	}

 	spin_unlock(&journal->j_list_lock);
 	cond_resched();

-	if (*nr_to_scan && next_tid)
+	if (*nr_to_scan && !is_last)
 		goto again;
 out:
 	trace_jbd2_shrink_checkpoint_list(journal, first_tid, tid, last_tid,
* [PATCH 4/4] ext4: fix incorrect tid assumption in ext4_fc_mark_ineligible()
From: Luis Henriques (SUSE) @ 2024-07-23 15:44 UTC
To: Theodore Ts'o, Andreas Dilger, Jan Kara, Harshad Shirwadkar
Cc: linux-ext4, linux-kernel, Luis Henriques (SUSE)
Function ext4_fc_mark_ineligible() assumes that '0' is not a valid value
for transaction IDs, which is incorrect.
Furthermore, the sbi->s_fc_ineligible_tid handling also makes the same
assumption by being initialised to '0'. Fortunately, the sb flag
EXT4_MF_FC_INELIGIBLE can be used to check whether sbi->s_fc_ineligible_tid
has been previously set instead of comparing it with '0'.
Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
---
fs/ext4/fast_commit.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
index 3926a05eceee..3e0793cfea38 100644
--- a/fs/ext4/fast_commit.c
+++ b/fs/ext4/fast_commit.c
@@ -339,22 +339,29 @@ void ext4_fc_mark_ineligible(struct super_block *sb, int reason, handle_t *handl
 {
 	struct ext4_sb_info *sbi = EXT4_SB(sb);
 	tid_t tid;
+	bool has_transaction = true;
+	bool is_ineligible;

 	if (ext4_fc_disabled(sb))
 		return;

-	ext4_set_mount_flag(sb, EXT4_MF_FC_INELIGIBLE);
 	if (handle && !IS_ERR(handle))
 		tid = handle->h_transaction->t_tid;
 	else {
 		read_lock(&sbi->s_journal->j_state_lock);
-		tid = sbi->s_journal->j_running_transaction ?
-				sbi->s_journal->j_running_transaction->t_tid : 0;
+		if (sbi->s_journal->j_running_transaction)
+			tid = sbi->s_journal->j_running_transaction->t_tid;
+		else
+			has_transaction = false;
 		read_unlock(&sbi->s_journal->j_state_lock);
 	}
 	spin_lock(&sbi->s_fc_lock);
-	if (tid_gt(tid, sbi->s_fc_ineligible_tid))
+	is_ineligible = ext4_test_mount_flag(sb, EXT4_MF_FC_INELIGIBLE);
+	if (has_transaction &&
+	    ((!is_ineligible) ||
+	     (is_ineligible && tid_gt(tid, sbi->s_fc_ineligible_tid))))
 		sbi->s_fc_ineligible_tid = tid;
+	ext4_set_mount_flag(sb, EXT4_MF_FC_INELIGIBLE);
 	spin_unlock(&sbi->s_fc_lock);
 	WARN_ON(reason >= EXT4_FC_REASON_MAX);
 	sbi->s_fc_stats.fc_ineligible_reason_count[reason]++;
* Re: [PATCH 1/4] ext4: fix incorrect tid assumption in ext4_wait_for_tail_page_commit()
From: Jan Kara @ 2024-07-24 9:17 UTC
To: Luis Henriques (SUSE)
Cc: Theodore Ts'o, Andreas Dilger, Jan Kara, Harshad Shirwadkar,
linux-ext4, linux-kernel
On Tue 23-07-24 16:43:59, Luis Henriques (SUSE) wrote:
> Function ext4_wait_for_tail_page_commit() assumes that '0' is not a valid
> value for transaction IDs, which is incorrect. Don't assume that and invoke
> jbd2_log_wait_commit() if the journal had a committing transaction instead.
>
> Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
> ---
> fs/ext4/inode.c | 10 ++++++----
> 1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 941c1c0d5c6e..e65fc2086701 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -5279,8 +5279,9 @@ static void ext4_wait_for_tail_page_commit(struct inode *inode)
> {
> unsigned offset;
> journal_t *journal = EXT4_SB(inode->i_sb)->s_journal;
> - tid_t commit_tid = 0;
> + tid_t commit_tid;
> int ret;
> + bool has_transaction = false;
>
> offset = inode->i_size & (PAGE_SIZE - 1);
> /*
> @@ -5305,12 +5306,13 @@ static void ext4_wait_for_tail_page_commit(struct inode *inode)
> folio_put(folio);
> if (ret != -EBUSY)
> return;
> - commit_tid = 0;
We should set "has_transaction = false" here to make things work properly
when looping... Otherwise looks good.
Honza
> read_lock(&journal->j_state_lock);
> - if (journal->j_committing_transaction)
> + if (journal->j_committing_transaction) {
> commit_tid = journal->j_committing_transaction->t_tid;
> + has_transaction = true;
> + }
> read_unlock(&journal->j_state_lock);
> - if (commit_tid)
> + if (has_transaction)
> jbd2_log_wait_commit(journal, commit_tid);
> }
> }
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
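
The looping concern raised in this review can be modelled in user space. The sketch below is an illustration only: struct snapshot and count_waits() are hypothetical names, each array element stands for what one pass of the loop would observe in journal->j_committing_transaction, and incrementing a counter stands in for the call to jbd2_log_wait_commit().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t tid_t;	/* assumption: stand-in for the kernel's tid_t */

/* Hypothetical: what one pass of the loop sees under j_state_lock. */
struct snapshot {
	bool committing;	/* is there a committing transaction? */
	tid_t tid;		/* its tid; only meaningful if 'committing' */
};

/*
 * Count the passes that would wait for a commit. When reset_each_pass is
 * false, a flag set by an earlier pass leaks into later passes, so the
 * loop keeps waiting on a stale tid.
 */
static inline size_t count_waits(const struct snapshot *snaps, size_t n,
				 bool reset_each_pass)
{
	bool has_transaction = false;
	tid_t commit_tid = 0;
	size_t waits = 0;

	for (size_t i = 0; i < n; i++) {
		if (reset_each_pass)
			has_transaction = false;
		if (snaps[i].committing) {
			commit_tid = snaps[i].tid;
			has_transaction = true;
		}
		if (has_transaction)
			waits++;	/* models jbd2_log_wait_commit() */
	}
	(void)commit_tid;	/* only the flag matters for the count */
	return waits;
}
```

With one committing transaction (tid 0) on the first pass and none afterwards, resetting the flag gives a single wait, while the version without the reset waits on every pass.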
* Re: [PATCH 2/4] ext4: fix incorrect tid assumption in __jbd2_log_wait_for_space()
From: Jan Kara @ 2024-07-24 9:20 UTC
To: Luis Henriques (SUSE)
Cc: Theodore Ts'o, Andreas Dilger, Jan Kara, Harshad Shirwadkar,
linux-ext4, linux-kernel
On Tue 23-07-24 16:44:00, Luis Henriques (SUSE) wrote:
> Function __jbd2_log_wait_for_space() assumes that '0' is not a valid value
> for transaction IDs, which is incorrect. Don't assume that and invoke
> jbd2_log_wait_commit() if the journal had a committing transaction instead.
>
> Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
Looks good. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> fs/jbd2/checkpoint.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
> index 951f78634adf..77bc522e6821 100644
> --- a/fs/jbd2/checkpoint.c
> +++ b/fs/jbd2/checkpoint.c
> @@ -79,9 +79,12 @@ __releases(&journal->j_state_lock)
> if (space_left < nblocks) {
> int chkpt = journal->j_checkpoint_transactions != NULL;
> tid_t tid = 0;
> + bool has_transaction = false;
>
> - if (journal->j_committing_transaction)
> + if (journal->j_committing_transaction) {
> tid = journal->j_committing_transaction->t_tid;
> + has_transaction = true;
> + }
> spin_unlock(&journal->j_list_lock);
> write_unlock(&journal->j_state_lock);
> if (chkpt) {
> @@ -89,7 +92,7 @@ __releases(&journal->j_state_lock)
> } else if (jbd2_cleanup_journal_tail(journal) == 0) {
> /* We were able to recover space; yay! */
> ;
> - } else if (tid) {
> + } else if (has_transaction) {
> /*
> * jbd2_journal_commit_transaction() may want
> * to take the checkpoint_mutex if JBD2_FLUSHED
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
* Re: [PATCH 3/4] ext4: fix incorrect tid assumption in jbd2_journal_shrink_checkpoint_list()
From: Jan Kara @ 2024-07-24 9:29 UTC
To: Luis Henriques (SUSE)
Cc: Theodore Ts'o, Andreas Dilger, Jan Kara, Harshad Shirwadkar,
linux-ext4, linux-kernel
On Tue 23-07-24 16:44:01, Luis Henriques (SUSE) wrote:
> Function jbd2_journal_shrink_checkpoint_list() assumes that '0' is not a
> valid value for transaction IDs, which is incorrect. Don't assume that and
> use two extra boolean variables to control the loop iterations and keep
> track of the first and last tid.
>
> Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
> ---
> fs/jbd2/checkpoint.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
> index 77bc522e6821..f5a594237b7a 100644
> --- a/fs/jbd2/checkpoint.c
> +++ b/fs/jbd2/checkpoint.c
> @@ -410,6 +410,7 @@ unsigned long jbd2_journal_shrink_checkpoint_list(journal_t *journal,
> tid_t tid = 0;
> unsigned long nr_freed = 0;
> unsigned long freed;
> + bool is_first = true, is_last = false;
>
> again:
> spin_lock(&journal->j_list_lock);
> @@ -429,8 +430,10 @@ unsigned long jbd2_journal_shrink_checkpoint_list(journal_t *journal,
> else
> transaction = journal->j_checkpoint_transactions;
>
> - if (!first_tid)
> + if (is_first) {
> first_tid = transaction->t_tid;
> + is_first = false;
> + }
> last_transaction = journal->j_checkpoint_transactions->t_cpprev;
> next_transaction = transaction;
> last_tid = last_transaction->t_tid;
> @@ -455,12 +458,13 @@ unsigned long jbd2_journal_shrink_checkpoint_list(journal_t *journal,
> } else {
> journal->j_shrink_transaction = NULL;
> next_tid = 0;
> + is_last = true;
> }
>
> spin_unlock(&journal->j_list_lock);
> cond_resched();
>
> - if (*nr_to_scan && next_tid)
> + if (*nr_to_scan && !is_last)
I'd make this:
if (*nr_to_scan && journal->j_shrink_transaction)
goto again;
and just remove is_last. Also we might rename is_first to first_set? At
least to me it would be more comprehensible. Thanks!
Honza
> goto again;
> out:
> trace_jbd2_shrink_checkpoint_list(journal, first_tid, tid, last_tid,
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
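
A rough user-space sketch of the two suggestions in this review, under stated assumptions: struct txn, struct scan_state, record_first() and scan_again() are hypothetical names, and first_set is taken to mean "first_tid has already been recorded" (so it starts out false, the inverse of is_first). The loop decision is keyed on the shrink_transaction pointer rather than on a tid value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t tid_t;	/* assumption: stand-in for the kernel's tid_t */

/* Hypothetical, minimal model of a checkpoint transaction. */
struct txn {
	tid_t tid;
};

/* Hypothetical scan state; mirrors j_shrink_transaction and first_tid. */
struct scan_state {
	const struct txn *shrink_transaction;	/* NULL once the scan is done */
	bool first_set;				/* has first_tid been recorded? */
	tid_t first_tid;
};

/* Record the first tid exactly once, even if that tid happens to be 0. */
static inline void record_first(struct scan_state *st, tid_t tid)
{
	if (!st->first_set) {
		st->first_tid = tid;
		st->first_set = true;
	}
}

/* Loop again only while there is budget left and a transaction to resume. */
static inline bool scan_again(const struct scan_state *st,
			      unsigned long nr_to_scan)
{
	return nr_to_scan && st->shrink_transaction != NULL;
}
```

Keying the loop on the pointer avoids a second flag, since the pointer is already set to NULL at the point where the scan reaches the last transaction.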
* Re: [PATCH 4/4] ext4: fix incorrect tid assumption in ext4_fc_mark_ineligible()
From: Jan Kara @ 2024-07-24 10:15 UTC
To: Luis Henriques (SUSE)
Cc: Theodore Ts'o, Andreas Dilger, Jan Kara, Harshad Shirwadkar,
linux-ext4, linux-kernel
On Tue 23-07-24 16:44:02, Luis Henriques (SUSE) wrote:
> Function jbd2_journal_shrink_checkpoint_list() assumes that '0' is not a
> valid value for transaction IDs, which is incorrect.
>
> Furthermore, the sbi->s_fc_ineligible_tid handling also makes the same
> assumption by being initialised to '0'. Fortunately, the sb flag
> EXT4_MF_FC_INELIGIBLE can be used to check whether sbi->s_fc_ineligible_tid
> has been previously set instead of comparing it with '0'.
>
> Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
Just one style nit below, otherwise looks good. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
BTW, the ineligibility handling looks flaky to me, in particular the cases
where we call ext4_fc_mark_ineligible() with NULL handle seem racy to me as
fastcommit can happen *before* we mark the filesystem as ineligible. But
that's not really related to your changes, they just made me look at that
code in detail and I couldn't resist complaining :)
> ---
> fs/ext4/fast_commit.c | 15 +++++++++++----
> 1 file changed, 11 insertions(+), 4 deletions(-)
>
> diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
> index 3926a05eceee..3e0793cfea38 100644
> --- a/fs/ext4/fast_commit.c
> +++ b/fs/ext4/fast_commit.c
> @@ -339,22 +339,29 @@ void ext4_fc_mark_ineligible(struct super_block *sb, int reason, handle_t *handl
> {
> struct ext4_sb_info *sbi = EXT4_SB(sb);
> tid_t tid;
> + bool has_transaction = true;
> + bool is_ineligible;
>
> if (ext4_fc_disabled(sb))
> return;
>
> - ext4_set_mount_flag(sb, EXT4_MF_FC_INELIGIBLE);
> if (handle && !IS_ERR(handle))
> tid = handle->h_transaction->t_tid;
> else {
> read_lock(&sbi->s_journal->j_state_lock);
> - tid = sbi->s_journal->j_running_transaction ?
> - sbi->s_journal->j_running_transaction->t_tid : 0;
> + if (sbi->s_journal->j_running_transaction)
> + tid = sbi->s_journal->j_running_transaction->t_tid;
> + else
> + has_transaction = false;
> read_unlock(&sbi->s_journal->j_state_lock);
> }
> spin_lock(&sbi->s_fc_lock);
> - if (tid_gt(tid, sbi->s_fc_ineligible_tid))
> + is_ineligible = ext4_test_mount_flag(sb, EXT4_MF_FC_INELIGIBLE);
> + if (has_transaction &&
> + ((!is_ineligible) ||
^^ these extra braces look strange
> + (is_ineligible && tid_gt(tid, sbi->s_fc_ineligible_tid))))
> sbi->s_fc_ineligible_tid = tid;
> + ext4_set_mount_flag(sb, EXT4_MF_FC_INELIGIBLE);
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
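
For the style nit above, one possible shape of the condition without the extra parentheses is sketched below. This is not taken from a later version of the patch: should_update() and should_update_v1() are hypothetical names, and tid_gt() is a user-space model of the jbd2 helper. Since `!a || (a && b)` is equivalent to `!a || b`, the repeated is_ineligible test can be dropped as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t tid_t;	/* assumption: stand-in for the kernel's tid_t */

/* Wraparound-aware "x is newer than y", modelled on jbd2's tid_gt(). */
static inline bool tid_gt(tid_t x, tid_t y)
{
	return (int32_t)(x - y) > 0;
}

/* Condition as posted in this patch, extra parentheses included. */
static inline bool should_update_v1(bool has_transaction, bool is_ineligible,
				    tid_t tid, tid_t recorded)
{
	return has_transaction &&
	       ((!is_ineligible) ||
		(is_ineligible && tid_gt(tid, recorded)));
}

/* Equivalent, shorter form: record the tid if none was recorded or it is newer. */
static inline bool should_update(bool has_transaction, bool is_ineligible,
				 tid_t tid, tid_t recorded)
{
	return has_transaction &&
	       (!is_ineligible || tid_gt(tid, recorded));
}
```

Both forms record a tid of 0 the first time the filesystem is marked ineligible, which is the case the original comparison against an initial value of 0 could not express.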
* Re: [PATCH 1/4] ext4: fix incorrect tid assumption in ext4_wait_for_tail_page_commit()
From: Luis Henriques @ 2024-07-24 13:35 UTC
To: Jan Kara
Cc: Theodore Ts'o, Andreas Dilger, Harshad Shirwadkar, linux-ext4,
linux-kernel
On Wed, Jul 24 2024, Jan Kara wrote:
> On Tue 23-07-24 16:43:59, Luis Henriques (SUSE) wrote:
>> Function ext4_wait_for_tail_page_commit() assumes that '0' is not a valid
>> value for transaction IDs, which is incorrect. Don't assume that and invoke
>> jbd2_log_wait_commit() if the journal had a committing transaction instead.
>>
>> Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
>> ---
>> fs/ext4/inode.c | 10 ++++++----
>> 1 file changed, 6 insertions(+), 4 deletions(-)
>>
>> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
>> index 941c1c0d5c6e..e65fc2086701 100644
>> --- a/fs/ext4/inode.c
>> +++ b/fs/ext4/inode.c
>> @@ -5279,8 +5279,9 @@ static void ext4_wait_for_tail_page_commit(struct inode *inode)
>> {
>> unsigned offset;
>> journal_t *journal = EXT4_SB(inode->i_sb)->s_journal;
>> - tid_t commit_tid = 0;
>> + tid_t commit_tid;
>> int ret;
>> + bool has_transaction = false;
>>
>> offset = inode->i_size & (PAGE_SIZE - 1);
>> /*
>> @@ -5305,12 +5306,13 @@ static void ext4_wait_for_tail_page_commit(struct inode *inode)
>> folio_put(folio);
>> if (ret != -EBUSY)
>> return;
>> - commit_tid = 0;
>
> We should set "has_transaction = false" here to make things work properly
> when looping... Otherwise looks good.
Ah! Good point. I'll fix that, thanks!
Cheers,
--
Luís
>
> Honza
>
>> read_lock(&journal->j_state_lock);
>> - if (journal->j_committing_transaction)
>> + if (journal->j_committing_transaction) {
>> commit_tid = journal->j_committing_transaction->t_tid;
>> + has_transaction = true;
>> + }
>> read_unlock(&journal->j_state_lock);
>> - if (commit_tid)
>> + if (has_transaction)
>> jbd2_log_wait_commit(journal, commit_tid);
>> }
>> }
>>
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR
* Re: [PATCH 3/4] ext4: fix incorrect tid assumption in jbd2_journal_shrink_checkpoint_list()
From: Luis Henriques @ 2024-07-24 13:38 UTC
To: Jan Kara
Cc: Luis Henriques (SUSE), Theodore Ts'o, Andreas Dilger,
Harshad Shirwadkar, linux-ext4, linux-kernel
On Wed, Jul 24 2024, Jan Kara wrote:
> On Tue 23-07-24 16:44:01, Luis Henriques (SUSE) wrote:
>> Function jbd2_journal_shrink_checkpoint_list() assumes that '0' is not a
>> valid value for transaction IDs, which is incorrect. Don't assume that and
>> use two extra boolean variables to control the loop iterations and keep
>> track of the first and last tid.
>>
>> Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
>> ---
>> fs/jbd2/checkpoint.c | 8 ++++++--
>> 1 file changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
>> index 77bc522e6821..f5a594237b7a 100644
>> --- a/fs/jbd2/checkpoint.c
>> +++ b/fs/jbd2/checkpoint.c
>> @@ -410,6 +410,7 @@ unsigned long jbd2_journal_shrink_checkpoint_list(journal_t *journal,
>> tid_t tid = 0;
>> unsigned long nr_freed = 0;
>> unsigned long freed;
>> + bool is_first = true, is_last = false;
>>
>> again:
>> spin_lock(&journal->j_list_lock);
>> @@ -429,8 +430,10 @@ unsigned long jbd2_journal_shrink_checkpoint_list(journal_t *journal,
>> else
>> transaction = journal->j_checkpoint_transactions;
>>
>> - if (!first_tid)
>> + if (is_first) {
>> first_tid = transaction->t_tid;
>> + is_first = false;
>> + }
>> last_transaction = journal->j_checkpoint_transactions->t_cpprev;
>> next_transaction = transaction;
>> last_tid = last_transaction->t_tid;
>> @@ -455,12 +458,13 @@ unsigned long jbd2_journal_shrink_checkpoint_list(journal_t *journal,
>> } else {
>> journal->j_shrink_transaction = NULL;
>> next_tid = 0;
>> + is_last = true;
>> }
>>
>> spin_unlock(&journal->j_list_lock);
>> cond_resched();
>>
>> - if (*nr_to_scan && next_tid)
>> + if (*nr_to_scan && !is_last)
>
> I'd make this:
>
> if (*nr_to_scan && journal->j_shrink_transaction)
> goto again;
>
> and just remove is_last. Also we might rename is_first to first_set? At
> least to me it would be more comprehensible. Thanks!
Sure, both suggestions make sense. I'll update the patches for v2.
Cheers,
--
Luís
>
> Honza
>
>> goto again;
>> out:
>> trace_jbd2_shrink_checkpoint_list(journal, first_tid, tid, last_tid,
>>
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR
* Re: [PATCH 4/4] ext4: fix incorrect tid assumption in ext4_fc_mark_ineligible()
From: Luis Henriques @ 2024-07-24 14:02 UTC
To: Jan Kara
Cc: Theodore Ts'o, Andreas Dilger, Harshad Shirwadkar, linux-ext4,
linux-kernel
On Wed, Jul 24 2024, Jan Kara wrote:
> On Tue 23-07-24 16:44:02, Luis Henriques (SUSE) wrote:
>> Function jbd2_journal_shrink_checkpoint_list() assumes that '0' is not a
>> valid value for transaction IDs, which is incorrect.
>>
>> Furthermore, the sbi->s_fc_ineligible_tid handling also makes the same
>> assumption by being initialised to '0'. Fortunately, the sb flag
>> EXT4_MF_FC_INELIGIBLE can be used to check whether sbi->s_fc_ineligible_tid
>> has been previously set instead of comparing it with '0'.
>>
>> Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
>
> Just one style nit below, otherwise looks good. Feel free to add:
>
> Reviewed-by: Jan Kara <jack@suse.cz>
>
> BTW, the ineligibility handling looks flaky to me, in particular the cases
> where we call ext4_fc_mark_ineligible() with NULL handle seem racy to me as
> fastcommit can happen *before* we mark the filesystem as ineligible. But
> that's not really related to your changes, they just made me look at that
> code in detail and I couldn't resist complaining :)
Heh, fair enough. Regarding this race, I may try to look into it but I'll
need to dig a bit more. And yeah it's probably better to separate that
from this patch.
>
>> ---
>> fs/ext4/fast_commit.c | 15 +++++++++++----
>> 1 file changed, 11 insertions(+), 4 deletions(-)
>>
>> diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
>> index 3926a05eceee..3e0793cfea38 100644
>> --- a/fs/ext4/fast_commit.c
>> +++ b/fs/ext4/fast_commit.c
>> @@ -339,22 +339,29 @@ void ext4_fc_mark_ineligible(struct super_block *sb, int reason, handle_t *handl
>> {
>> struct ext4_sb_info *sbi = EXT4_SB(sb);
>> tid_t tid;
>> + bool has_transaction = true;
>> + bool is_ineligible;
>>
>> if (ext4_fc_disabled(sb))
>> return;
>>
>> - ext4_set_mount_flag(sb, EXT4_MF_FC_INELIGIBLE);
>> if (handle && !IS_ERR(handle))
>> tid = handle->h_transaction->t_tid;
>> else {
>> read_lock(&sbi->s_journal->j_state_lock);
>> - tid = sbi->s_journal->j_running_transaction ?
>> - sbi->s_journal->j_running_transaction->t_tid : 0;
>> + if (sbi->s_journal->j_running_transaction)
>> + tid = sbi->s_journal->j_running_transaction->t_tid;
>> + else
>> + has_transaction = false;
>> read_unlock(&sbi->s_journal->j_state_lock);
>> }
>> spin_lock(&sbi->s_fc_lock);
>> - if (tid_gt(tid, sbi->s_fc_ineligible_tid))
>> + is_ineligible = ext4_test_mount_flag(sb, EXT4_MF_FC_INELIGIBLE);
>> + if (has_transaction &&
>> + ((!is_ineligible) ||
> ^^ these extra braces look strange
>
They do, indeed. I think my initial version had an explicit comparison
with 'false'. v2 will have those removed. And once again, thanks for
your review, Jan!
Cheers,
--
Luís
>> + (is_ineligible && tid_gt(tid, sbi->s_fc_ineligible_tid))))
>> sbi->s_fc_ineligible_tid = tid;
>> + ext4_set_mount_flag(sb, EXT4_MF_FC_INELIGIBLE);
>
> Honza
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR
* Re: [PATCH 4/4] ext4: fix incorrect tid assumption in ext4_fc_mark_ineligible()
From: Jan Kara @ 2024-07-24 15:13 UTC
To: Luis Henriques
Cc: Jan Kara, Theodore Ts'o, Andreas Dilger, Harshad Shirwadkar,
linux-ext4, linux-kernel
On Wed 24-07-24 15:02:49, Luis Henriques wrote:
> On Wed, Jul 24 2024, Jan Kara wrote:
>
> > On Tue 23-07-24 16:44:02, Luis Henriques (SUSE) wrote:
> >> Function jbd2_journal_shrink_checkpoint_list() assumes that '0' is not a
> >> valid value for transaction IDs, which is incorrect.
> >>
> >> Furthermore, the sbi->s_fc_ineligible_tid handling also makes the same
> >> assumption by being initialised to '0'. Fortunately, the sb flag
> >> EXT4_MF_FC_INELIGIBLE can be used to check whether sbi->s_fc_ineligible_tid
> >> has been previously set instead of comparing it with '0'.
> >>
> >> Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
> >
> > Just one style nit below, otherwise looks good. Feel free to add:
> >
> > Reviewed-by: Jan Kara <jack@suse.cz>
> >
> > BTW, the ineligibility handling looks flaky to me, in particular the cases
> > where we call ext4_fc_mark_ineligible() with NULL handle seem racy to me as
> > fastcommit can happen *before* we mark the filesystem as ineligible. But
> > that's not really related to your changes, they just made me look at that
> > code in detail and I couldn't resist complaining :)
>
> Heh, fair enough. Regarding this race, I may try to look into it but I'll
> need to dig a bit more. And yeah it's probably better to separate that
> from this patch.
I suspect all the places that mark the fs as ineligible with NULL handle
need to actually mark corresponding transactions as ineligible using handle
instead. This is going to require a bit of churn e.g. for stuff like
resize or __track_dentry_update() but shouldn't be hard to do.
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR