From: Brian Foster <bfoster@redhat.com>
To: linux-bcachefs@vger.kernel.org
Subject: Re: [PATCH 1/5] bcachefs: more aggressive fast path write buffer key flushing
Date: Tue, 21 Mar 2023 09:40:10 -0400
Message-ID: <ZBmzunR3GYZw8Tkl@bfoster>
In-Reply-To: <20230321132014.1438249-2-bfoster@redhat.com>
On Tue, Mar 21, 2023 at 09:20:10AM -0400, Brian Foster wrote:
> The btree write buffer flush code is prone to causing journal
> deadlock due to inefficient use and release of reservation space.
> Reservation is not pre-reserved for write buffered keys (as is done
> for key cache keys, for example), because the write buffer flush
> side uses a fast path that attempts insertion without need for any
> reservation at all.
>
> The write buffer flush attempts to deal with this by inserting keys
> using the BTREE_INSERT_JOURNAL_RECLAIM flag to return an error on
> journal reservations that require blocking. Upon first error, it
> falls back to a slow path that inserts in journal order and supports
> moving the associated journal pin forward.
>
> The problem is that under pathological conditions (i.e. smaller log,
> larger write buffer and journal reservation pressure), we've seen
> instances where the fast path fails fairly quickly without having
> completed many insertions, and then the slow path is unable to push
> the journal pin forward enough to free up the space it needs to
> completely flush the buffer. This problem is occasionally reproduced
> by fstest generic/333.
>
> To avoid this problem, update the fast path algorithm to skip key
> inserts that fail due to inability to acquire needed journal
> reservation without immediately breaking out of the loop. Instead,
> insert as many keys as possible, zap the sequence numbers to mark
> them as processed, and then fall back to the slow path to process
> the remaining set in journal order. This reduces the amount of
> journal reservation that might be required to flush the entire
> buffer and increases the odds that the slow path is able to move the
> journal pin forward and free up space as keys are processed.
>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
> fs/bcachefs/btree_write_buffer.c | 41 ++++++++++++++++----------------
> 1 file changed, 21 insertions(+), 20 deletions(-)
>
> diff --git a/fs/bcachefs/btree_write_buffer.c b/fs/bcachefs/btree_write_buffer.c
> index 80f4b9839bc2..32f20e631de0 100644
> --- a/fs/bcachefs/btree_write_buffer.c
> +++ b/fs/bcachefs/btree_write_buffer.c
...
> @@ -198,23 +203,19 @@ int __bch2_btree_write_buffer_flush(struct btree_trans *trans, unsigned commit_f
> slowpath:
> trace_write_buffer_flush_slowpath(trans, i - keys, nr);
>
> - dst = keys;
> - for (; i < keys + nr; i++) {
> - if (i + 1 < keys + nr &&
> - i[0].btree == i[1].btree &&
> - bpos_eq(i[0].k.k.p, i[1].k.k.p))
> - continue;
> -
> - *dst = *i;
> - dst++;
> - }
> - nr = dst - keys;
> -

This one leaves an unused var warning. I've fixed it locally but will
wait for any further comments before reposting.

Brian

> + /*
> + * Now sort the rest by journal seq and bump the journal pin as we go.
> + * The slowpath zapped the seq of keys that were successfully flushed so
> + * we can skip those here.
> + */
> sort(keys, nr, sizeof(keys[0]),
> btree_write_buffered_journal_cmp,
> NULL);
>
> for (i = keys; i < keys + nr; i++) {
> + if (!i->journal_seq)
> + continue;
> +
> if (i->journal_seq > pin.seq) {
> struct journal_entry_pin pin2;
>
> --
> 2.39.2
>
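
For readers following the commit message above, here is a minimal
standalone sketch of the two-phase flush it describes: try every key on
the fast path without stopping at the first failure, zap the seq of each
key that went in, then sort what remains by journal seq and advance the
pin while skipping zapped keys. This is a userspace model only, not the
bcachefs code. struct wb_key, the would_block flag, and wb_flush_model()
are hypothetical names made up for illustration, and the journal pin is
reduced to a plain integer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a write buffered key (not the kernel struct). */
struct wb_key {
	uint64_t journal_seq;	/* 0 means "already flushed by the fast path" */
	bool would_block;	/* assumption: fast path insert would need to block */
};

/* Order keys by journal seq; zapped keys (seq 0) sort to the front. */
static int wb_key_seq_cmp(const void *l, const void *r)
{
	const struct wb_key *a = l, *b = r;

	return (a->journal_seq > b->journal_seq) -
	       (a->journal_seq < b->journal_seq);
}

/*
 * Model of the flush. Returns the number of keys left for the slow path
 * and advances *pin_seq to the newest seq the slow path processed.
 */
size_t wb_flush_model(struct wb_key *keys, size_t nr, uint64_t *pin_seq)
{
	size_t i, skipped = 0, slow = 0;

	assert(keys || !nr);

	/* Fast path: skip keys that would block instead of bailing out. */
	for (i = 0; i < nr; i++) {
		if (keys[i].would_block) {
			skipped++;
			continue;
		}
		keys[i].journal_seq = 0;	/* zap: mark as processed */
	}

	if (!skipped)
		return 0;

	/* Slow path: journal order, ignoring keys the fast path handled. */
	qsort(keys, nr, sizeof(keys[0]), wb_key_seq_cmp);

	for (i = 0; i < nr; i++) {
		if (!keys[i].journal_seq)
			continue;

		if (keys[i].journal_seq > *pin_seq)
			*pin_seq = keys[i].journal_seq;	/* move the pin forward */

		slow++;		/* a blocking insert would happen here */
	}

	return slow;
}
```

In this model, a buffer where half the keys would block leaves only that
half for the slow path, which is the reduction in required journal
reservation the commit message is after.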