From: Brian Foster <bfoster@redhat.com>
To: Kent Overstreet <kent.overstreet@linux.dev>
Cc: linux-bcachefs@vger.kernel.org
Subject: Re: [PATCH 07/17] bcachefs: Make journal replay more efficient
Date: Tue, 14 Nov 2023 08:19:04 -0500 [thread overview]
Message-ID: <ZVNzyMRK6BHeEIiS@bfoster> (raw)
In-Reply-To: <20231110163157.2736111-8-kent.overstreet@linux.dev>
On Fri, Nov 10, 2023 at 11:31:44AM -0500, Kent Overstreet wrote:
> Journal replay now first attempts to replay keys in sorted order,
> similar to how the btree write buffer flush path works.
>
> Any keys that can not be replayed due to journal deadlock are then left
> for later and replayed in journal order, unpinning journal entries as we
> go.
>
> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
> ---
Just a question and a nit..
> fs/bcachefs/errcode.h | 1 -
> fs/bcachefs/recovery.c | 84 ++++++++++++++++++++++++++++--------------
> 2 files changed, 56 insertions(+), 29 deletions(-)
>
...
> diff --git a/fs/bcachefs/recovery.c b/fs/bcachefs/recovery.c
> index 6eafff2557a4..3524d1e0003e 100644
> --- a/fs/bcachefs/recovery.c
> +++ b/fs/bcachefs/recovery.c
...
> @@ -166,26 +159,61 @@ static int bch2_journal_replay(struct bch_fs *c)
> goto err;
> }
>
> - for (i = 0; i < keys->nr; i++) {
> - k = keys_sorted[i];
> + /*
> + * First, attempt to replay keys in sorted order. This is more
> + * efficient, but some might fail if that would cause a journal
> + * deadlock.
> + */
What does "sorted order" mean here vs. when we sort by journal seq
below?
> + for (size_t i = 0; i < keys->nr; i++) {
> + cond_resched();
> +
> + struct journal_key *k = keys->d + i;
> +
> + ret = commit_do(trans, NULL, NULL,
> + BTREE_INSERT_NOFAIL|
> + BTREE_INSERT_JOURNAL_RECLAIM|
> + (!k->allocated ? BTREE_INSERT_JOURNAL_REPLAY : 0),
> + bch2_journal_replay_key(trans, k));
> + BUG_ON(!ret && !k->overwritten);
> + if (ret) {
> + ret = darray_push(&keys_sorted, k);
> + if (ret)
> + goto err;
> + }
> + }
> +
> + /*
> + * Now, replay any remaining keys in the order in which they appear in
> + * the journal, unpinning those journal entries as we go:
> + */
> + sort(keys_sorted.data, keys_sorted.nr,
> + sizeof(keys_sorted.data[0]),
> + journal_sort_seq_cmp, NULL);
>
> + darray_for_each(keys_sorted, kp) {
> cond_resched();
>
> + struct journal_key *k = *kp;
> +
> replay_now_at(j, k->journal_seq);
>
> - ret = bch2_trans_do(c, NULL, NULL,
> - BTREE_INSERT_NOFAIL|
> - (!k->allocated
> - ? BTREE_INSERT_JOURNAL_REPLAY|BCH_WATERMARK_reclaim
> - : 0),
> + ret = commit_do(trans, NULL, NULL,
> + BTREE_INSERT_NOFAIL|
> + (!k->allocated
> + ? BTREE_INSERT_JOURNAL_REPLAY|BCH_WATERMARK_reclaim
> + : 0),
> bch2_journal_replay_key(trans, k));
> - if (ret) {
> - bch_err(c, "journal replay: error while replaying key at btree %s level %u: %s",
> - bch2_btree_id_str(k->btree_id), k->level, bch2_err_str(ret));
> + bch_err_msg(c, ret, "while replaying key at btree %s level %u:",
> + bch2_btree_id_str(k->btree_id), k->level);
> + if (ret)
> goto err;
> - }
> +
> + BUG_ON(!k->overwritten);
> }
>
> + bch2_trans_put(trans);
> + trans = NULL;
> +
Nit: This looks spurious w/ the exit path, just FWIW.
Brian
> replay_now_at(j, j->replay_journal_seq_end);
> j->replay_journal_seq = 0;
>
> @@ -196,10 +224,10 @@ static int bch2_journal_replay(struct bch_fs *c)
> if (keys->nr && !ret)
> bch2_journal_log_msg(c, "journal replay finished");
> err:
> - kvfree(keys_sorted);
> -
> - if (ret)
> - bch_err_fn(c, ret);
> + if (trans)
> + bch2_trans_put(trans);
> + darray_exit(&keys_sorted);
> + bch_err_fn(c, ret);
> return ret;
> }
>
> --
> 2.42.0
>
>
next prev parent reply other threads:[~2023-11-14 13:18 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-10 16:31 [PATCH 00/17] btree write buffer & journal optimizations Kent Overstreet
2023-11-10 16:31 ` [PATCH 01/17] bcachefs: Kill journal pre-reservations Kent Overstreet
2023-11-10 16:31 ` [PATCH 02/17] bcachefs: track_event_change() Kent Overstreet
2023-11-10 16:31 ` [PATCH 03/17] bcachefs: Journal pins must always have a flush_fn Kent Overstreet
2023-11-13 15:22 ` Brian Foster
2023-11-13 16:36 ` Kent Overstreet
2023-11-13 17:08 ` Brian Foster
2023-11-10 16:31 ` [PATCH 04/17] bcachefs: BTREE_INSERT_JOURNAL_REPLAY now "don't init trans->journal_res" Kent Overstreet
2023-11-10 16:31 ` [PATCH 05/17] bcachefs: Kill BTREE_UPDATE_PREJOURNAL Kent Overstreet
2023-11-13 15:29 ` Brian Foster
2023-11-13 16:49 ` Kent Overstreet
2023-11-14 13:17 ` Brian Foster
2023-11-10 16:31 ` [PATCH 06/17] bcachefs: Go rw before journal replay Kent Overstreet
2023-11-10 16:31 ` [PATCH 07/17] bcachefs: Make journal replay more efficient Kent Overstreet
2023-11-14 13:19 ` Brian Foster [this message]
2023-11-15 1:50 ` Kent Overstreet
2023-11-10 16:31 ` [PATCH 08/17] bcachefs: Don't flush journal after replay Kent Overstreet
2023-11-10 16:31 ` [PATCH 09/17] bcachefs: Unwritten journal buffers are always dirty Kent Overstreet
2023-11-10 16:31 ` [PATCH 10/17] bcachefs: journal->buf_lock Kent Overstreet
2023-11-10 16:31 ` [PATCH 11/17] bcachefs: bch2_journal_block_reservations() Kent Overstreet
2023-11-10 16:31 ` [PATCH 12/17] bcachefs: Clean up btree write buffer write ref handling Kent Overstreet
2023-11-10 16:31 ` [PATCH 13/17] bcachefs: bch2_btree_write_buffer_flush_locked() Kent Overstreet
2023-11-10 16:31 ` [PATCH 14/17] bcachefs: bch2_btree_write_buffer_flush() -> bch2_btree_write_buffer_tryflush() Kent Overstreet
2023-11-10 16:31 ` [PATCH 15/17] bcachefs: Improve btree write buffer tracepoints Kent Overstreet
2023-11-10 16:31 ` [PATCH 16/17] bcachefs: btree write buffer now slurps keys from journal Kent Overstreet
2023-11-21 10:56 ` Geert Uytterhoeven
2023-11-21 16:52 ` Kent Overstreet
2023-11-10 16:31 ` [PATCH 17/17] bcachefs: Inline btree write buffer sort Kent Overstreet
2023-11-10 16:42 ` [PATCH 00/17] btree write buffer & journal optimizations Kent Overstreet
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZVNzyMRK6BHeEIiS@bfoster \
--to=bfoster@redhat.com \
--cc=kent.overstreet@linux.dev \
--cc=linux-bcachefs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox