* [PATCH 0/3] jbd2: some fixes and cleanup about "jbd2: fix several checkpoint inconsistent issues" @ 2023-07-14 2:55 Zhang Yi 2023-07-14 2:55 ` [PATCH 1/3] jbd2: fix checkpoint cleanup performance regression Zhang Yi ` (3 more replies) 0 siblings, 4 replies; 7+ messages in thread From: Zhang Yi @ 2023-07-14 2:55 UTC (permalink / raw) To: linux-ext4 Cc: tytso, adilger.kernel, jack, yi.zhang, yi.zhang, chengzhihao1, yang.lee, yukuai3 From: Zhang Yi <yi.zhang@huawei.com> These patch set includes 2 fixes and 1 cleanup about "jbd2: fix several checkpoint inconsistent issues" patch set[1] recently applied in dev branch. The first patch fix a performance regression introduced while merging journal_shrink_one_cp_list(). The second one add a missing check before dropping checkpoint buffer that could lead to filesystem inconsistent. The last patch which is from Yang Li, remove the unused __cp_buffer_busy() helper. Please see the corresponding patch for details. [1] https://lore.kernel.org/linux-ext4/168918657577.3681557.17979362698032386800.b4-ty@mit.edu/T/#t Thanks, Yi. Yang Li (1): jbd2: remove unused function '__cp_buffer_busy' Zhang Yi (1): jbd2: fix checkpoint cleanup performance regression Zhihao Cheng (1): jbd2: Check 'jh->b_transaction' before remove it from checkpoint fs/jbd2/checkpoint.c | 34 ++++++++++++++++------------------ 1 file changed, 16 insertions(+), 18 deletions(-) -- 2.39.2 ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 1/3] jbd2: fix checkpoint cleanup performance regression 2023-07-14 2:55 [PATCH 0/3] jbd2: some fixes and cleanup about "jbd2: fix several checkpoint inconsistent issues" Zhang Yi @ 2023-07-14 2:55 ` Zhang Yi 2023-07-31 16:06 ` Jan Kara 2023-07-14 2:55 ` [PATCH 2/3] jbd2: Check 'jh->b_transaction' before remove it from checkpoint Zhang Yi ` (2 subsequent siblings) 3 siblings, 1 reply; 7+ messages in thread From: Zhang Yi @ 2023-07-14 2:55 UTC (permalink / raw) To: linux-ext4 Cc: tytso, adilger.kernel, jack, yi.zhang, yi.zhang, chengzhihao1, yang.lee, yukuai3 From: Zhang Yi <yi.zhang@huawei.com> journal_clean_one_cp_list() has been merged into journal_shrink_one_cp_list(), but do chekpoint buffer cleanup from the committing process is just a best effort, it should stop scan once it meet a busy buffer, or else it will cause a lot of invalid buffer scan and checks. We catch a performance regression when doing fs_mark tests below. Test cmd: ./fs_mark -d scratch -s 1024 -n 10000 -t 1 -D 100 -N 100 Before merging checkpoint buffer cleanup: FSUse% Count Size Files/sec App Overhead 95 10000 1024 8304.9 49033 After merging checkpoint buffer cleanup: FSUse% Count Size Files/sec App Overhead 95 10000 1024 7649.0 50012 FSUse% Count Size Files/sec App Overhead 95 10000 1024 2107.1 50871 After merging checkpoint buffer cleanup, the total loop count in journal_shrink_one_cp_list() could be up to 6,261,600+ (50,000+ ~ 100,000+ in general), most of them are invalid. This patch fix it through passing 'shrink_type' into journal_shrink_one_cp_list() and add a new 'SHRINK_BUSY_STOP' to indicate it should stop once meet a busy buffer. After fix, the loop count descending back to 10,000+. After this fix: FSUse% Count Size Files/sec App Overhead 95 10000 1024 8558.4 49109 Fixes: b98dba273a0e ("jbd2: remove journal_clean_one_cp_list()") Signed-off-by: Zhang Yi <yi.zhang@huawei.com> --- fs/jbd2/checkpoint.c | 20 ++++++++++++++------ 1 file changed, 14 insertions(+), 6 deletions(-) diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c index 9ec91017a7f3..936c6d758a65 100644 --- a/fs/jbd2/checkpoint.c +++ b/fs/jbd2/checkpoint.c @@ -349,6 +349,8 @@ int jbd2_cleanup_journal_tail(journal_t *journal) /* Checkpoint list management */ +enum shrink_type {SHRINK_DESTROY, SHRINK_BUSY_STOP, SHRINK_BUSY_SKIP}; + /* * journal_shrink_one_cp_list * @@ -360,7 +362,8 @@ int jbd2_cleanup_journal_tail(journal_t *journal) * Called with j_list_lock held. */ static unsigned long journal_shrink_one_cp_list(struct journal_head *jh, - bool destroy, bool *released) + enum shrink_type type, + bool *released) { struct journal_head *last_jh; struct journal_head *next_jh = jh; @@ -376,12 +379,15 @@ static unsigned long journal_shrink_one_cp_list(struct journal_head *jh, jh = next_jh; next_jh = jh->b_cpnext; - if (destroy) { + if (type == SHRINK_DESTROY) { ret = __jbd2_journal_remove_checkpoint(jh); } else { ret = jbd2_journal_try_remove_checkpoint(jh); - if (ret < 0) - continue; + if (ret < 0) { + if (type == SHRINK_BUSY_SKIP) + continue; + break; + } } nr_freed++; @@ -445,7 +451,7 @@ unsigned long jbd2_journal_shrink_checkpoint_list(journal_t *journal, tid = transaction->t_tid; freed = journal_shrink_one_cp_list(transaction->t_checkpoint_list, - false, &released); + SHRINK_BUSY_SKIP, &released); nr_freed += freed; (*nr_to_scan) -= min(*nr_to_scan, freed); if (*nr_to_scan == 0) @@ -485,19 +491,21 @@ unsigned long jbd2_journal_shrink_checkpoint_list(journal_t *journal, void __jbd2_journal_clean_checkpoint_list(journal_t *journal, bool destroy) { transaction_t *transaction, *last_transaction, *next_transaction; + enum shrink_type type; bool released; transaction = journal->j_checkpoint_transactions; if (!transaction) return; + type = destroy ? SHRINK_DESTROY : SHRINK_BUSY_STOP; last_transaction = transaction->t_cpprev; next_transaction = transaction; do { transaction = next_transaction; next_transaction = transaction->t_cpnext; journal_shrink_one_cp_list(transaction->t_checkpoint_list, - destroy, &released); + type, &released); /* * This function only frees up some memory if possible so we * dont have an obligation to finish processing. Bail out if -- 2.39.2 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 1/3] jbd2: fix checkpoint cleanup performance regression 2023-07-14 2:55 ` [PATCH 1/3] jbd2: fix checkpoint cleanup performance regression Zhang Yi @ 2023-07-31 16:06 ` Jan Kara 0 siblings, 0 replies; 7+ messages in thread From: Jan Kara @ 2023-07-31 16:06 UTC (permalink / raw) To: Zhang Yi Cc: linux-ext4, tytso, adilger.kernel, jack, yi.zhang, chengzhihao1, yang.lee, yukuai3 On Fri 14-07-23 10:55:26, Zhang Yi wrote: > From: Zhang Yi <yi.zhang@huawei.com> > > journal_clean_one_cp_list() has been merged into > journal_shrink_one_cp_list(), but do chekpoint buffer cleanup from the > committing process is just a best effort, it should stop scan once it > meet a busy buffer, or else it will cause a lot of invalid buffer scan > and checks. We catch a performance regression when doing fs_mark tests > below. > > Test cmd: > ./fs_mark -d scratch -s 1024 -n 10000 -t 1 -D 100 -N 100 > > Before merging checkpoint buffer cleanup: > FSUse% Count Size Files/sec App Overhead > 95 10000 1024 8304.9 49033 > > After merging checkpoint buffer cleanup: > FSUse% Count Size Files/sec App Overhead > 95 10000 1024 7649.0 50012 > FSUse% Count Size Files/sec App Overhead > 95 10000 1024 2107.1 50871 > > After merging checkpoint buffer cleanup, the total loop count in > journal_shrink_one_cp_list() could be up to 6,261,600+ (50,000+ ~ > 100,000+ in general), most of them are invalid. This patch fix it > through passing 'shrink_type' into journal_shrink_one_cp_list() and add > a new 'SHRINK_BUSY_STOP' to indicate it should stop once meet a busy > buffer. After fix, the loop count descending back to 10,000+. > > After this fix: > FSUse% Count Size Files/sec App Overhead > 95 10000 1024 8558.4 49109 > > Fixes: b98dba273a0e ("jbd2: remove journal_clean_one_cp_list()") > Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Looks good. Feel free to add: Reviewed-by: Jan Kara <jack@suse.cz> Honza > --- > fs/jbd2/checkpoint.c | 20 ++++++++++++++------ > 1 file changed, 14 insertions(+), 6 deletions(-) > > diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c > index 9ec91017a7f3..936c6d758a65 100644 > --- a/fs/jbd2/checkpoint.c > +++ b/fs/jbd2/checkpoint.c > @@ -349,6 +349,8 @@ int jbd2_cleanup_journal_tail(journal_t *journal) > > /* Checkpoint list management */ > > +enum shrink_type {SHRINK_DESTROY, SHRINK_BUSY_STOP, SHRINK_BUSY_SKIP}; > + > /* > * journal_shrink_one_cp_list > * > @@ -360,7 +362,8 @@ int jbd2_cleanup_journal_tail(journal_t *journal) > * Called with j_list_lock held. > */ > static unsigned long journal_shrink_one_cp_list(struct journal_head *jh, > - bool destroy, bool *released) > + enum shrink_type type, > + bool *released) > { > struct journal_head *last_jh; > struct journal_head *next_jh = jh; > @@ -376,12 +379,15 @@ static unsigned long journal_shrink_one_cp_list(struct journal_head *jh, > jh = next_jh; > next_jh = jh->b_cpnext; > > - if (destroy) { > + if (type == SHRINK_DESTROY) { > ret = __jbd2_journal_remove_checkpoint(jh); > } else { > ret = jbd2_journal_try_remove_checkpoint(jh); > - if (ret < 0) > - continue; > + if (ret < 0) { > + if (type == SHRINK_BUSY_SKIP) > + continue; > + break; > + } > } > > nr_freed++; > @@ -445,7 +451,7 @@ unsigned long jbd2_journal_shrink_checkpoint_list(journal_t *journal, > tid = transaction->t_tid; > > freed = journal_shrink_one_cp_list(transaction->t_checkpoint_list, > - false, &released); > + SHRINK_BUSY_SKIP, &released); > nr_freed += freed; > (*nr_to_scan) -= min(*nr_to_scan, freed); > if (*nr_to_scan == 0) > @@ -485,19 +491,21 @@ unsigned long jbd2_journal_shrink_checkpoint_list(journal_t *journal, > void __jbd2_journal_clean_checkpoint_list(journal_t *journal, bool destroy) > { > transaction_t *transaction, *last_transaction, *next_transaction; > + enum shrink_type type; > bool released; > > transaction = journal->j_checkpoint_transactions; > if (!transaction) > return; > > + type = destroy ? SHRINK_DESTROY : SHRINK_BUSY_STOP; > last_transaction = transaction->t_cpprev; > next_transaction = transaction; > do { > transaction = next_transaction; > next_transaction = transaction->t_cpnext; > journal_shrink_one_cp_list(transaction->t_checkpoint_list, > - destroy, &released); > + type, &released); > /* > * This function only frees up some memory if possible so we > * dont have an obligation to finish processing. Bail out if > -- > 2.39.2 > -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 2/3] jbd2: Check 'jh->b_transaction' before remove it from checkpoint 2023-07-14 2:55 [PATCH 0/3] jbd2: some fixes and cleanup about "jbd2: fix several checkpoint inconsistent issues" Zhang Yi 2023-07-14 2:55 ` [PATCH 1/3] jbd2: fix checkpoint cleanup performance regression Zhang Yi @ 2023-07-14 2:55 ` Zhang Yi 2023-07-31 16:15 ` Jan Kara 2023-07-14 2:55 ` [PATCH 3/3] jbd2: remove unused function '__cp_buffer_busy' Zhang Yi 2023-08-05 12:20 ` [PATCH 0/3] jbd2: some fixes and cleanup about "jbd2: fix several checkpoint inconsistent issues" Theodore Ts'o 3 siblings, 1 reply; 7+ messages in thread From: Zhang Yi @ 2023-07-14 2:55 UTC (permalink / raw) To: linux-ext4 Cc: tytso, adilger.kernel, jack, yi.zhang, yi.zhang, chengzhihao1, yang.lee, yukuai3 From: Zhihao Cheng <chengzhihao1@huawei.com> Following process will corrupt ext4 image: Step 1: jbd2_journal_commit_transaction __jbd2_journal_insert_checkpoint(jh, commit_transaction) // Put jh into trans1->t_checkpoint_list journal->j_checkpoint_transactions = commit_transaction // Put trans1 into journal->j_checkpoint_transactions Step 2: do_get_write_access test_clear_buffer_dirty(bh) // clear buffer dirty,set jbd dirty __jbd2_journal_file_buffer(jh, transaction) // jh belongs to trans2 Step 3: drop_cache journal_shrink_one_cp_list jbd2_journal_try_remove_checkpoint if (!trylock_buffer(bh)) // lock bh, true if (buffer_dirty(bh)) // buffer is not dirty __jbd2_journal_remove_checkpoint(jh) // remove jh from trans1->t_checkpoint_list Step 4: jbd2_log_do_checkpoint trans1 = journal->j_checkpoint_transactions // jh is not in trans1->t_checkpoint_list jbd2_cleanup_journal_tail(journal) // trans1 is done Step 5: Power cut, trans2 is not committed, jh is lost in next mounting. Fix it by checking 'jh->b_transaction' before remove it from checkpoint. Fixes: 46f881b5b175 ("jbd2: fix a race when checking checkpoint buffer busy") Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Zhang Yi <yi.zhang@huawei.com> --- fs/jbd2/checkpoint.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c index 936c6d758a65..f033ac807013 100644 --- a/fs/jbd2/checkpoint.c +++ b/fs/jbd2/checkpoint.c @@ -639,6 +639,8 @@ int jbd2_journal_try_remove_checkpoint(struct journal_head *jh) { struct buffer_head *bh = jh2bh(jh); + if (jh->b_transaction) + return -EBUSY; if (!trylock_buffer(bh)) return -EBUSY; if (buffer_dirty(bh)) { -- 2.39.2 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 2/3] jbd2: Check 'jh->b_transaction' before remove it from checkpoint 2023-07-14 2:55 ` [PATCH 2/3] jbd2: Check 'jh->b_transaction' before remove it from checkpoint Zhang Yi @ 2023-07-31 16:15 ` Jan Kara 0 siblings, 0 replies; 7+ messages in thread From: Jan Kara @ 2023-07-31 16:15 UTC (permalink / raw) To: Zhang Yi Cc: linux-ext4, tytso, adilger.kernel, jack, yi.zhang, chengzhihao1, yang.lee, yukuai3 On Fri 14-07-23 10:55:27, Zhang Yi wrote: > From: Zhihao Cheng <chengzhihao1@huawei.com> > > Following process will corrupt ext4 image: > Step 1: > jbd2_journal_commit_transaction > __jbd2_journal_insert_checkpoint(jh, commit_transaction) > // Put jh into trans1->t_checkpoint_list > journal->j_checkpoint_transactions = commit_transaction > // Put trans1 into journal->j_checkpoint_transactions > > Step 2: > do_get_write_access > test_clear_buffer_dirty(bh) // clear buffer dirty,set jbd dirty > __jbd2_journal_file_buffer(jh, transaction) // jh belongs to trans2 > > Step 3: > drop_cache > journal_shrink_one_cp_list > jbd2_journal_try_remove_checkpoint > if (!trylock_buffer(bh)) // lock bh, true > if (buffer_dirty(bh)) // buffer is not dirty > __jbd2_journal_remove_checkpoint(jh) > // remove jh from trans1->t_checkpoint_list > > Step 4: > jbd2_log_do_checkpoint > trans1 = journal->j_checkpoint_transactions > // jh is not in trans1->t_checkpoint_list > jbd2_cleanup_journal_tail(journal) // trans1 is done > > Step 5: Power cut, trans2 is not committed, jh is lost in next mounting. > > Fix it by checking 'jh->b_transaction' before remove it from checkpoint. > > Fixes: 46f881b5b175 ("jbd2: fix a race when checking checkpoint buffer busy") > Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> > Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Indeed! I've missed this difference between __cp_buffer_busy() and jbd2_journal_try_remove_checkpoint() during my review of 46f881b5b175. The fix looks good. Feel free to add: Reviewed-by: Jan Kara <jack@suse.cz> Honza > --- > fs/jbd2/checkpoint.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c > index 936c6d758a65..f033ac807013 100644 > --- a/fs/jbd2/checkpoint.c > +++ b/fs/jbd2/checkpoint.c > @@ -639,6 +639,8 @@ int jbd2_journal_try_remove_checkpoint(struct journal_head *jh) > { > struct buffer_head *bh = jh2bh(jh); > > + if (jh->b_transaction) > + return -EBUSY; > if (!trylock_buffer(bh)) > return -EBUSY; > if (buffer_dirty(bh)) { > -- > 2.39.2 > -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 3/3] jbd2: remove unused function '__cp_buffer_busy' 2023-07-14 2:55 [PATCH 0/3] jbd2: some fixes and cleanup about "jbd2: fix several checkpoint inconsistent issues" Zhang Yi 2023-07-14 2:55 ` [PATCH 1/3] jbd2: fix checkpoint cleanup performance regression Zhang Yi 2023-07-14 2:55 ` [PATCH 2/3] jbd2: Check 'jh->b_transaction' before remove it from checkpoint Zhang Yi @ 2023-07-14 2:55 ` Zhang Yi 2023-08-05 12:20 ` [PATCH 0/3] jbd2: some fixes and cleanup about "jbd2: fix several checkpoint inconsistent issues" Theodore Ts'o 3 siblings, 0 replies; 7+ messages in thread From: Zhang Yi @ 2023-07-14 2:55 UTC (permalink / raw) To: linux-ext4 Cc: tytso, adilger.kernel, jack, yi.zhang, yi.zhang, chengzhihao1, yang.lee, yukuai3 From: Yang Li <yang.lee@linux.alibaba.com> The code calling function '__cp_buffer_busy' has been removed, so the function should also be removed. silence the warning: fs/jbd2/checkpoint.c:48:20: warning: unused function '__cp_buffer_busy' Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=5518 Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Reviewed-by: Jan Kara <jack@suse.cz> --- fs/jbd2/checkpoint.c | 12 ------------ 1 file changed, 12 deletions(-) diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c index f033ac807013..118699fff2f9 100644 --- a/fs/jbd2/checkpoint.c +++ b/fs/jbd2/checkpoint.c @@ -40,18 +40,6 @@ static inline void __buffer_unlink(struct journal_head *jh) } } -/* - * Check a checkpoint buffer could be release or not. - * - * Requires j_list_lock - */ -static inline bool __cp_buffer_busy(struct journal_head *jh) -{ - struct buffer_head *bh = jh2bh(jh); - - return (jh->b_transaction || buffer_locked(bh) || buffer_dirty(bh)); -} - /* * __jbd2_log_wait_for_space: wait until there is space in the journal. * -- 2.39.2 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 0/3] jbd2: some fixes and cleanup about "jbd2: fix several checkpoint inconsistent issues" 2023-07-14 2:55 [PATCH 0/3] jbd2: some fixes and cleanup about "jbd2: fix several checkpoint inconsistent issues" Zhang Yi ` (2 preceding siblings ...) 2023-07-14 2:55 ` [PATCH 3/3] jbd2: remove unused function '__cp_buffer_busy' Zhang Yi @ 2023-08-05 12:20 ` Theodore Ts'o 3 siblings, 0 replies; 7+ messages in thread From: Theodore Ts'o @ 2023-08-05 12:20 UTC (permalink / raw) To: linux-ext4, Zhang Yi Cc: Theodore Ts'o, adilger.kernel, jack, yi.zhang, chengzhihao1, yang.lee, yukuai3 On Fri, 14 Jul 2023 10:55:25 +0800, Zhang Yi wrote: > These patch set includes 2 fixes and 1 cleanup about "jbd2: fix several > checkpoint inconsistent issues" patch set[1] recently applied in dev > branch. The first patch fix a performance regression introduced while > merging journal_shrink_one_cp_list(). The second one add a missing > check before dropping checkpoint buffer that could lead to filesystem > inconsistent. The last patch which is from Yang Li, remove the unused > __cp_buffer_busy() helper. Please see the corresponding patch for > details. > > [...] Applied, thanks! [1/3] jbd2: fix checkpoint cleanup performance regression commit: 373ac521799d9e97061515aca6ec6621789036bb [2/3] jbd2: Check 'jh->b_transaction' before remove it from checkpoint commit: 590a809ff743e7bd890ba5fb36bc38e20a36de53 [3/3] jbd2: remove unused function '__cp_buffer_busy' commit: 5f02a30eac5cc1c081cbdb42d19fd0ded00b0618 Best regards, -- Theodore Ts'o <tytso@mit.edu> ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2023-08-05 12:21 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-07-14 2:55 [PATCH 0/3] jbd2: some fixes and cleanup about "jbd2: fix several checkpoint inconsistent issues" Zhang Yi 2023-07-14 2:55 ` [PATCH 1/3] jbd2: fix checkpoint cleanup performance regression Zhang Yi 2023-07-31 16:06 ` Jan Kara 2023-07-14 2:55 ` [PATCH 2/3] jbd2: Check 'jh->b_transaction' before remove it from checkpoint Zhang Yi 2023-07-31 16:15 ` Jan Kara 2023-07-14 2:55 ` [PATCH 3/3] jbd2: remove unused function '__cp_buffer_busy' Zhang Yi 2023-08-05 12:20 ` [PATCH 0/3] jbd2: some fixes and cleanup about "jbd2: fix several checkpoint inconsistent issues" Theodore Ts'o
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).