public inbox for linux-btrfs@vger.kernel.org
From: Josef Bacik <josef@toxicpanda.com>
To: Stefan Roesch <shr@fb.com>
Cc: linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH v2 0/4] btrfs: sysfs: set / query btrfs stripe size
Date: Thu, 28 Oct 2021 10:27:38 -0400
Message-ID: <YXqzWv7t77ZpKIig@localhost.localdomain>
In-Reply-To: <YXqpFxiAVrC92io6@localhost.localdomain>

On Thu, Oct 28, 2021 at 09:43:51AM -0400, Josef Bacik wrote:
> On Wed, Oct 27, 2021 at 01:14:37PM -0700, Stefan Roesch wrote:
> > Motivation:
> > The btrfs allocator is currently not ideal for all workloads. It tends
> > to suffer from overallocating data block groups and underallocating
> > metadata block groups. This results in filesystems becoming read-only
> > even though there is plenty of "free" space.
> > 
> > This is naturally confusing and distressing to users.
> > 
> > Patches:
> > 1) Store the stripe and chunk size in the btrfs_space_info structure
> > 2) Add a sysfs entry to expose the above information
> > 3) Add a sysfs entry to force a space allocation
> > 4) Increase the default size of the metadata chunk allocation to 5GB
> >    for volumes greater than 50GB.
> > 
> > Testing:
> >   A new test is being added to the xfstest suite. For reference the
> >   corresponding patch has the title:
> >     [PATCH] btrfs: Test chunk allocation with different sizes
> > 
> >   In addition also manual testing has been performed.
> >     - Run xfstests with the changes and the new test. It does not
> >       show new diffs.
> >     - Test with storage devices 10G, 20G, 30G, 50G, 60G
> >       - Default allocation
> >       - Increase of chunk size
> >       - If the stripe size is > the free space, it allocates
> >         free space - 1MB. The 1MB is left as free space.
> >       - If the device has a storage size > 50G, it uses a 5GB
> >         chunk size for new allocations.
> > 
> > Stefan Roesch (4):
> >   btrfs: store stripe size and chunk size in space-info struct.
> >   btrfs: expose stripe and chunk size in sysfs.
> >   btrfs: add force_chunk_alloc sysfs entry to force allocation
> >   btrfs: increase metadata alloc size to 5GB for volumes > 50GB
> >
> 
> Sorry, I had this thought previously but it got lost when I started doing the
> actual code review.
> 
> We have conflated stripe size and chunk size here, and unfortunately "stripe
> size" means different things to different people.  What you are actually trying
> to do here is to allow us to allocate a larger logical chunk size.
> 
> In terms of how this works out in the code you are changing the correct thing,
> generally the stripe_size is what dictates the actual block group chunk size we
> end up with at the end.
> 
> But this is sort of confusing when it comes to the interface, because people are
> going to think it means something different.
> 
> Instead we should name the sysfs file chunk_size, and then keep the code you
> have the way it is, just with the new name.  That way it's clear to the user
> that they're changing how large of a chunk we're allocating at any given time.
> 
> Make that change, and I have a few other code comments, and then that should be
> good.  Thanks,
> 

In fact I talked about this with Johannes just now.  We sort of conflate the two
things, max_chunk_size and max_stripe_size, to get the answer we want.  But
these aren't well named and don't really behave in a way you'd expect.

Currently, we set max_stripe_size to make sure we clamp down on any dev extents
we find.  So if the whole disk is free we clearly don't want to allocate the
whole thing, so we clamp it down to max_stripe_size.  This, in effect, ends up
being our actual chunk_size.  We have this max_chunk_size thing but it doesn't
really do anything in practice because our stripe_size is already clamped down
so it'll be <= max_chunk_size.
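
As a rough illustration of the clamping described above, here is a toy model
(not the kernel code). It assumes a single data stripe, so the chunk size is
just the clamped stripe size; the function names and the 1 GiB / 10 GiB limits
are hypothetical values picked for the example.

```python
GiB = 1024 ** 3


def clamp_stripe(free_dev_extent: int, max_stripe_size: int) -> int:
    """Clamp the free device extent we found down to max_stripe_size."""
    return min(free_dev_extent, max_stripe_size)


def chunk_size(free_dev_extent: int, max_stripe_size: int,
               max_chunk_size: int) -> int:
    """Toy model with one data stripe: clamp the stripe, then the chunk."""
    stripe = clamp_stripe(free_dev_extent, max_stripe_size)
    # With max_stripe_size <= max_chunk_size this second clamp never
    # changes the result in this single-stripe model.
    return min(stripe, max_chunk_size)


# Illustrative limits only; a whole 100 GiB disk is free.
free = 100 * GiB
result = chunk_size(free, max_stripe_size=1 * GiB, max_chunk_size=10 * GiB)
```

In this model `result` is the same value `clamp_stripe()` returns on its own,
which is the sense in which max_stripe_size ends up acting as the chunk size.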

All this is to say we should simply set max_stripe_size = max_chunk_size, but
call max_chunk_size default_chunk_size, because that's really what it is.  So
you should:

1) Change the sysfs file to be chunk_size or something similar.
2) Don't expose stripe_size via sysfs, it's just a function of chunk_size.
3) Set stripe_size == chunk_size.
4) Get rid of the max_chunk_size logic, it's unneeded.
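
The four points above could be modelled roughly as follows. This is a sketch
under assumptions, not the patch: `SpaceInfo`, `store_chunk_size()` and
`alloc_size()` are hypothetical names, and only the single tunable chunk_size
comes from the suggestion itself.

```python
from dataclasses import dataclass

GiB = 1024 ** 3


@dataclass
class SpaceInfo:
    """Hypothetical stand-in for the per-space-info allocation settings."""
    chunk_size: int

    @property
    def stripe_size(self) -> int:
        # Points 2 and 3: stripe_size is derived from chunk_size and is
        # neither stored separately nor exposed on its own.
        return self.chunk_size


def store_chunk_size(info: SpaceInfo, value: int) -> None:
    """Model of a write to the renamed chunk_size file (point 1)."""
    if value <= 0:
        raise ValueError("chunk_size must be positive")
    info.chunk_size = value


def alloc_size(info: SpaceInfo, free_dev_extent: int) -> int:
    """Next chunk size: one clamp only, no max_chunk_size step (point 4)."""
    return min(free_dev_extent, info.stripe_size)


info = SpaceInfo(chunk_size=1 * GiB)
store_chunk_size(info, 5 * GiB)
```

After the store, `info.stripe_size` follows `info.chunk_size` automatically,
and `alloc_size()` clamps a large free extent to that one value.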

I think that's the proper way to deal with everything.  If there are any corner
cases I'm missing, feel free to point them out, but I'm pretty sure 1-3 are
correct.  Thanks,

Josef

Thread overview: 12+ messages
2021-10-27 20:14 [PATCH v2 0/4] btrfs: sysfs: set / query btrfs stripe size Stefan Roesch
2021-10-27 20:14 ` [PATCH v2 1/4] btrfs: store stripe size and chunk size in space-info struct Stefan Roesch
2021-10-28 13:48   ` Josef Bacik
2021-10-27 20:14 ` [PATCH v2 2/4] btrfs: expose stripe and chunk size in sysfs Stefan Roesch
2021-10-27 20:14 ` [PATCH v2 3/4] btrfs: add force_chunk_alloc sysfs entry to force allocation Stefan Roesch
2021-10-27 20:14 ` [PATCH v2 4/4] btrfs: increase metadata alloc size to 5GB for volumes > 50GB Stefan Roesch
2021-10-28  1:14   ` Wang Yugui
2021-10-28 13:43 ` [PATCH v2 0/4] btrfs: sysfs: set / query btrfs stripe size Josef Bacik
2021-10-28 14:27   ` Josef Bacik [this message]
2021-10-28 15:00     ` Johannes Thumshirn
2021-10-29  3:11       ` Stefan Roesch
2021-10-29  8:31         ` Johannes Thumshirn
