Re: [PATCH] btrfs: adjust overcommit logic when very close to full

linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Josef Bacik <josef@toxicpanda.com>
To: David Sterba <dsterba@suse.cz>
Cc: linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH] btrfs: adjust overcommit logic when very close to full
Date: Wed, 20 Sep 2023 10:01:47 -0400	[thread overview]
Message-ID: <20230920140147.GB3796940@perftesting> (raw)
In-Reply-To: <20230918212950.GP2747@twin.jikos.cz>

On Mon, Sep 18, 2023 at 11:29:50PM +0200, David Sterba wrote:
> On Mon, Sep 18, 2023 at 03:27:47PM -0400, Josef Bacik wrote:
> > A user reported some unpleasant behavior with very small file systems.
> > The reproducer is this
> > 
> > mkfs.btrfs -f -m single -b 8g /dev/vdb
> > mount /dev/vdb /mnt/test
> > dd if=/dev/zero of=/mnt/test/testfile bs=512M count=20
> > 
> > This will result in usage that looks like this
> > 
> > Overall:
> >     Device size:                   8.00GiB
> >     Device allocated:              8.00GiB
> >     Device unallocated:            1.00MiB
> >     Device missing:                  0.00B
> >     Device slack:                  2.00GiB
> >     Used:                          5.47GiB
> >     Free (estimated):              2.52GiB      (min: 2.52GiB)
> >     Free (statfs, df):               0.00B
> >     Data ratio:                       1.00
> >     Metadata ratio:                   1.00
> >     Global reserve:                5.50MiB      (used: 0.00B)
> >     Multiple profiles:                  no
> > 
> > Data,single: Size:7.99GiB, Used:5.46GiB (68.41%)
> >    /dev/vdb        7.99GiB
> > 
> > Metadata,single: Size:8.00MiB, Used:5.77MiB (72.07%)
> >    /dev/vdb        8.00MiB
> > 
> > System,single: Size:4.00MiB, Used:16.00KiB (0.39%)
> >    /dev/vdb        4.00MiB
> > 
> > Unallocated:
> >    /dev/vdb        1.00MiB
> > 
> > As you can see we've gotten ourselves quite full with metadata, with all
> > of the disk being allocated for data.
> > 
> > On smaller file systems there's not a lot of time before we get full, so
> > our overcommit behavior bites us here.  Generally speaking data
> > reservations result in chunk allocations as we assume reservation ==
> > actual use for data.  This means at any point we could end up with a
> > chunk allocation for data, and if we're very close to full we could do
> > this before we have a chance to figure out that we need another metadata
> > chunk.
> > 
> > Address this by adjusting the overcommit logic.  Simply put we need to
> > take away 1 chunk from the available chunk space in case of a data
> > reservation.  This will allow us to stop overcommitting before we
> > potentially lose this space to a data allocation.  With this fix in
> > place we properly allocate a metadata chunk before we're completely
> > full, allowing for enough slack space in metadata.
> > 
> > Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> > ---
> >  fs/btrfs/space-info.c | 17 +++++++++++++++++
> >  1 file changed, 17 insertions(+)
> > 
> > diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
> > index d7e8cd4f140c..7aa53058d893 100644
> > --- a/fs/btrfs/space-info.c
> > +++ b/fs/btrfs/space-info.c
> > @@ -365,6 +365,23 @@ static u64 calc_available_free_space(struct btrfs_fs_info *fs_info,
> >  	factor = btrfs_bg_type_to_factor(profile);
> >  	avail = div_u64(avail, factor);
> >  
> > +	/*
> > +	 * Since data allocations immediately use block groups as part of the
> > +	 * reservation, because we assume that data reservations will == actual
> > +	 * usage, we could potentially overcommit and then immediately have that
> > +	 * available space used by a data allocation, which could put us in a
> > +	 * bind when we get close to filling the file system.
> > +	 *
> > +	 * To handle this simply remove 1G (which is our current maximum chunk
> > +	 * allocation size) from the available space.  If we are relatively
> > +	 * empty this won't affect our ability to overcommit much, and if we're
> > +	 * very close to full it'll keep us from getting into a position where
> > +	 * we've given ourselves very little metadata wiggle room.
> > +	 */
> > +	if (avail < SZ_1G)
> > +		return 0;
> > +	avail -= SZ_1G;
> 
> Should the value be derived from the alloc_chunk_ctl::max_chunk_size or
> chunk_size? Or at least use a named constant, similar to
> BTRFS_MAX_DATA_CHUNK_SIZE .

I originally had this, but we still have this logic in the chunk allocator

	/* Stripe size should not go beyond 1G. */
	ctl->stripe_size = min_t(u64, ctl->stripe_size, SZ_1G);

max_chunk_size tends to be larger than this, BTRFS_MAX_DATA_CHUNK_SIZE is 10G.

We need to go back and clean up the logic to not have the SZ_1G here, rename it
to BTRFS_MAX_CHUNK_SIZE or something like that, and then update my patch to use
this.

I opted for the cleanest fix for now, the cleanups can come after.
Alternatively I can do the cleanups and do the fix, it's up to you.  Thanks,

Josef

next prev parent reply	other threads:[~2023-09-20 14:01 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-18 19:27 [PATCH] btrfs: adjust overcommit logic when very close to full Josef Bacik
2023-09-18 20:14 ` Boris Burkov
2023-09-20 13:59   ` Josef Bacik
2023-09-20 19:02     ` David Sterba
2023-09-18 21:29 ` David Sterba
2023-09-20 14:01   ` Josef Bacik [this message]
2023-09-20 14:04     ` David Sterba
2023-09-20 19:05 ` David Sterba
2023-09-21 22:50   ` Filipe Manana
2023-09-22 10:25     ` David Sterba

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230920140147.GB3796940@perftesting \
    --to=josef@toxicpanda.com \
    --cc=dsterba@suse.cz \
    --cc=kernel-team@fb.com \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).