public inbox for linux-bcachefs@vger.kernel.org
 help / color / mirror / Atom feed
From: Brian Foster <bfoster@redhat.com>
To: linux-bcachefs@vger.kernel.org
Subject: Re: [PATCH 1/5] bcachefs: more aggressive fast path write buffer key flushing
Date: Tue, 21 Mar 2023 09:40:10 -0400	[thread overview]
Message-ID: <ZBmzunR3GYZw8Tkl@bfoster> (raw)
In-Reply-To: <20230321132014.1438249-2-bfoster@redhat.com>

On Tue, Mar 21, 2023 at 09:20:10AM -0400, Brian Foster wrote:
> The btree write buffer flush code is prone to causing journal
> deadlock due to inefficient use and release of reservation space.
> Reservation is not pre-reserved for write buffered keys (as is done
> for key cache keys, for example), because the write buffer flush
> side uses a fast path that attempts insertion without need for any
> reservation at all.
> 
> The write buffer flush attempts to deal with this by inserting keys
> using the BTREE_INSERT_JOURNAL_RECLAIM flag to return an error on
> journal reservations that require blocking. Upon first error, it
> falls back to a slow path that inserts in journal order and supports
> moving the associated journal pin forward.
> 
> The problem is that under pathological conditions (i.e. smaller log,
> larger write buffer and journal reservation pressure), we've seen
> instances where the fast path fails fairly quickly without having
> completed many insertions, and then the slow path is unable to push
> the journal pin forward enough to free up the space it needs to
> completely flush the buffer. This problem is occasionally reproduced
> by fstest generic/333.
> 
> To avoid this problem, update the fast path algorithm to skip key
> inserts that fail due to inability to acquire needed journal
> reservation without immediately breaking out of the loop. Instead,
> insert as many keys as possible, zap the sequence numbers to mark
> them as processed, and then fall back to the slow path to process
> the remaining set in journal order. This reduces the amount of
> journal reservation that might be required to flush the entire
> buffer and increases the odds that the slow path is able to move the
> journal pin forward and free up space as keys are processed.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
>  fs/bcachefs/btree_write_buffer.c | 41 ++++++++++++++++----------------
>  1 file changed, 21 insertions(+), 20 deletions(-)
> 
> diff --git a/fs/bcachefs/btree_write_buffer.c b/fs/bcachefs/btree_write_buffer.c
> index 80f4b9839bc2..32f20e631de0 100644
> --- a/fs/bcachefs/btree_write_buffer.c
> +++ b/fs/bcachefs/btree_write_buffer.c
...
> @@ -198,23 +203,19 @@ int __bch2_btree_write_buffer_flush(struct btree_trans *trans, unsigned commit_f
>  slowpath:
>  	trace_write_buffer_flush_slowpath(trans, i - keys, nr);
>  
> -	dst = keys;
> -	for (; i < keys + nr; i++) {
> -		if (i + 1 < keys + nr &&
> -		    i[0].btree == i[1].btree &&
> -		    bpos_eq(i[0].k.k.p, i[1].k.k.p))
> -			continue;
> -
> -		*dst = *i;
> -		dst++;
> -	}
> -	nr = dst - keys;
> -

This one leaves an unused var warning. I've fixed it locally but will
wait for any further comments before reposting.

Brian

> +	/*
> +	 * Now sort the rest by journal seq and bump the journal pin as we go.
> +	 * The slowpath zapped the seq of keys that were successfully flushed so
> +	 * we can skip those here.
> +	 */
>  	sort(keys, nr, sizeof(keys[0]),
>  	     btree_write_buffered_journal_cmp,
>  	     NULL);
>  
>  	for (i = keys; i < keys + nr; i++) {
> +		if (!i->journal_seq)
> +			continue;
> +
>  		if (i->journal_seq > pin.seq) {
>  			struct journal_entry_pin pin2;
>  
> -- 
> 2.39.2
> 


  reply	other threads:[~2023-03-21 13:39 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-21 13:20 [PATCH 0/5] bcachefs: journal stall fixes Brian Foster
2023-03-21 13:20 ` [PATCH 1/5] bcachefs: more aggressive fast path write buffer key flushing Brian Foster
2023-03-21 13:40   ` Brian Foster [this message]
2023-03-21 13:20 ` [PATCH 2/5] bcachefs: gracefully unwind journal res slowpath on shutdown Brian Foster
2023-03-21 13:20 ` [PATCH 3/5] bcachefs: refactor journal stuck checking into standalone helper Brian Foster
2023-03-21 13:20 ` [PATCH 4/5] bcachefs: drop unnecessary journal stuck check from space calculation Brian Foster
2023-03-21 13:20 ` [PATCH 5/5] RFC: bcachefs: use a timeout for the journal stuck condition Brian Foster

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZBmzunR3GYZw8Tkl@bfoster \
    --to=bfoster@redhat.com \
    --cc=linux-bcachefs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox