All of lore.kernel.org
 help / color / mirror / Atom feed
From: Brian Foster <bfoster@redhat.com>
To: linux-bcachefs@vger.kernel.org
Subject: Re: [PATCH 1/5] bcachefs: more aggressive fast path write buffer key flushing
Date: Tue, 21 Mar 2023 09:40:10 -0400	[thread overview]
Message-ID: <ZBmzunR3GYZw8Tkl@bfoster> (raw)
In-Reply-To: <20230321132014.1438249-2-bfoster@redhat.com>

On Tue, Mar 21, 2023 at 09:20:10AM -0400, Brian Foster wrote:
> The btree write buffer flush code is prone to causing journal
> deadlock due to inefficient use and release of reservation space.
> Reservation is not pre-reserved for write buffered keys (as is done
> for key cache keys, for example), because the write buffer flush
> side uses a fast path that attempts insertion without need for any
> reservation at all.
> 
> The write buffer flush attempts to deal with this by inserting keys
> using the BTREE_INSERT_JOURNAL_RECLAIM flag to return an error on
> journal reservations that require blocking. Upon first error, it
> falls back to a slow path that inserts in journal order and supports
> moving the associated journal pin forward.
> 
> The problem is that under pathological conditions (i.e. smaller log,
> larger write buffer and journal reservation pressure), we've seen
> instances where the fast path fails fairly quickly without having
> completed many insertions, and then the slow path is unable to push
> the journal pin forward enough to free up the space it needs to
> completely flush the buffer. This problem is occasionally reproduced
> by fstest generic/333.
> 
> To avoid this problem, update the fast path algorithm to skip key
> inserts that fail due to inability to acquire needed journal
> reservation without immediately breaking out of the loop. Instead,
> insert as many keys as possible, zap the sequence numbers to mark
> them as processed, and then fall back to the slow path to process
> the remaining set in journal order. This reduces the amount of
> journal reservation that might be required to flush the entire
> buffer and increases the odds that the slow path is able to move the
> journal pin forward and free up space as keys are processed.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
>  fs/bcachefs/btree_write_buffer.c | 41 ++++++++++++++++----------------
>  1 file changed, 21 insertions(+), 20 deletions(-)
> 
> diff --git a/fs/bcachefs/btree_write_buffer.c b/fs/bcachefs/btree_write_buffer.c
> index 80f4b9839bc2..32f20e631de0 100644
> --- a/fs/bcachefs/btree_write_buffer.c
> +++ b/fs/bcachefs/btree_write_buffer.c
...
> @@ -198,23 +203,19 @@ int __bch2_btree_write_buffer_flush(struct btree_trans *trans, unsigned commit_f
>  slowpath:
>  	trace_write_buffer_flush_slowpath(trans, i - keys, nr);
>  
> -	dst = keys;
> -	for (; i < keys + nr; i++) {
> -		if (i + 1 < keys + nr &&
> -		    i[0].btree == i[1].btree &&
> -		    bpos_eq(i[0].k.k.p, i[1].k.k.p))
> -			continue;
> -
> -		*dst = *i;
> -		dst++;
> -	}
> -	nr = dst - keys;
> -

This one leaves an unused var warning. I've fixed it locally but will
wait for any further comments before reposting.

Brian

> +	/*
> +	 * Now sort the rest by journal seq and bump the journal pin as we go.
> +	 * The slowpath zapped the seq of keys that were successfully flushed so
> +	 * we can skip those here.
> +	 */
>  	sort(keys, nr, sizeof(keys[0]),
>  	     btree_write_buffered_journal_cmp,
>  	     NULL);
>  
>  	for (i = keys; i < keys + nr; i++) {
> +		if (!i->journal_seq)
> +			continue;
> +
>  		if (i->journal_seq > pin.seq) {
>  			struct journal_entry_pin pin2;
>  
> -- 
> 2.39.2
> 


  reply	other threads:[~2023-03-21 13:39 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-21 13:20 [PATCH 0/5] bcachefs: journal stall fixes Brian Foster
2023-03-21 13:20 ` [PATCH 1/5] bcachefs: more aggressive fast path write buffer key flushing Brian Foster
2023-03-21 13:40   ` Brian Foster [this message]
2023-03-21 13:20 ` [PATCH 2/5] bcachefs: gracefully unwind journal res slowpath on shutdown Brian Foster
2023-03-21 13:20 ` [PATCH 3/5] bcachefs: refactor journal stuck checking into standalone helper Brian Foster
2023-03-21 13:20 ` [PATCH 4/5] bcachefs: drop unnecessary journal stuck check from space calculation Brian Foster
2023-03-21 13:20 ` [PATCH 5/5] RFC: bcachefs: use a timeout for the journal stuck condition Brian Foster

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZBmzunR3GYZw8Tkl@bfoster \
    --to=bfoster@redhat.com \
    --cc=linux-bcachefs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.