From: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
To: Pedro Macedo <pmacedo@pmacedo.com>
Cc: Anand Jain <anand.jain@oracle.com>,
	Roman Mamedov <rm@romanrm.net>, Remi Gauvin <remi@georgianit.com>,
	"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: Balance on 5-disk RAID1 put all data on 2 disks, leaving the rest empty
Date: Wed, 1 Nov 2023 22:13:17 -0400
Message-ID: <ZUMFvRDAragUzhlY@hungrycats.org>
In-Reply-To: <21245ede-7ef3-40ad-828f-91f6845e9273@pmacedo.com>

On Wed, Nov 01, 2023 at 08:20:56PM +0100, Pedro Macedo wrote:
> 
> On 27.10.23 06:21, Anand Jain wrote:
> > On 10/26/23 05:15, Roman Mamedov wrote:
> > > On Wed, 25 Oct 2023 17:08:08 -0400
> > > Remi Gauvin <remi@georgianit.com> wrote:
> > > 
> > > > On 2023-10-25 4:29 p.m., Peter Wedder wrote:
> > > > > Hello,
> > > > > 
> > > > > I had a RAID1 array on top of 4x4TB drives. Recently I
> > > > > removed one 4TB drive and added two 16TB drives to it. After
> > > > > running a full, unfiltered balance on the array, I am left
> > > > > in a situation where all the 4TB drives are completely
> > > > > empty, and all the data and metadata is on the 16TB drives.
> > > > > Is this normal? I was expecting to have at least some data
> > > > > on the smaller drives.
> > > > > 
> > > > 
> > > > > Yes, this is normal.  BTRFS allocates space on the drives with the
> > > > > most available free space.  The idea is to balance the 'unallocated'
> > > > > space on each drive, so they can be filled evenly.  The 4TB drives will
> > > > > be used when the 16TB drives have less than 4TB unallocated.
> > > 
> > 
> > Correct. That's the only allocation method we have at the moment. Do you
> > have any feedback on whether there are any other allocation methods that
> > make sense?
> 
> 
> IMHO, based on the frequency of this question appearing here/on reddit/other
> sites, perhaps allocation by absolute space used?  It should fit the
> expectations of most folks that if you have free space on a disk it will be
> utilised, plus has potential performance implications by always using as
> many devices as possible to write to as long as they have any space left.

That is how allocation works with striped profiles:  chunks are allocated
using space from all non-full drives, in order to use space and iops
optimally.
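
As a rough illustration of the striped case, here is a toy model in
Python.  It is a sketch, not the kernel's allocator: the function name,
the 1-unit stripe size, and the two-device minimum are all assumptions
made for the example.  Each chunk takes one unit from every device that
still has free space, so the stripe gets narrower as the small devices
fill up, but no space is left behind.

```python
def simulate_striped(sizes, min_devices=2):
    """Toy model of a striped profile: every chunk takes one unit from
    each device that still has free space.

    Returns (widths, free), where widths[i] is the number of devices
    used by chunk i and free is the space left on each device.
    min_devices is an assumed lower bound on stripe width.
    """
    free = list(sizes)
    widths = []
    while True:
        members = [i for i, f in enumerate(free) if f > 0]
        if len(members) < min_devices:
            break
        for i in members:
            free[i] -= 1
        widths.append(len(members))
    return widths, free


# Three 4 TB drives and two 16 TB drives, in whole-TB units.
widths, free = simulate_striped([4, 4, 4, 16, 16])
print(widths)  # first 4 chunks span 5 devices, the remaining 12 span 2
print(free)    # [0, 0, 0, 0, 0]
```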

For a non-striped profile like raid1, it's not possible to use all the
space without filling the larger devices first.  As the large devices
fill up, their free space becomes equal in size to the smaller devices,
and it's always possible to completely fill a raid1 array of equal-sized
devices.  If raid1 distributed data across the small devices at the same
time as the large devices, it would run out of space on small devices
before running out of space on the large ones, so significant space on
some devices would be wasted.
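
The raid1 argument can be checked with a small simulation.  This is a
simplified sketch under stated assumptions, not the real allocator:
chunks are 1 unit, each chunk puts one copy on each of two different
devices, and the "least_used" policy is only my reading of the
"allocate by absolute space used" suggestion.  The function name and
parameters are made up for the example.

```python
def simulate_raid1(sizes, policy, max_chunks=None):
    """Allocate 1-unit raid1 chunks (two copies on two different devices)
    until fewer than two devices have free space, or max_chunks is hit.

    policy "most_free":  pick the two devices with the most free space.
    policy "least_used": pick the two devices with the least used space
                         (hypothetical alternative policy).
    Returns (chunks_allocated, used_per_device).
    """
    used = [0] * len(sizes)
    chunks = 0
    while max_chunks is None or chunks < max_chunks:
        candidates = [i for i in range(len(sizes)) if used[i] < sizes[i]]
        if len(candidates) < 2:
            break
        if policy == "most_free":
            candidates.sort(key=lambda i: sizes[i] - used[i], reverse=True)
        elif policy == "least_used":
            candidates.sort(key=lambda i: used[i])
        else:
            raise ValueError(f"unknown policy: {policy}")
        for i in candidates[:2]:
            used[i] += 1
        chunks += 1
    return chunks, used


# The array from this thread (3x4TB + 2x16TB) holding 8 TB of data:
print(simulate_raid1([4, 4, 4, 16, 16], "most_free", max_chunks=8))
# (8, [0, 0, 0, 8, 8])

# A hypothetical array with a single large device, filled to the limit:
print(simulate_raid1([4, 4, 4, 16], "most_free"))   # (12, [4, 4, 4, 12])
print(simulate_raid1([4, 4, 4, 16], "least_used"))  # (8, [4, 4, 4, 4])
```

In the second array the "least_used" policy stops at 8 chunks with 12
units stranded on the large device, because there is no second device
left to hold the other copy.  With two large devices, as in the array
from this thread, the large devices can still pair with each other, so
how much is lost depends on the mix of device sizes.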

In some cases you really do want the data distributed across all the small
devices first, even though some of the space can't be used at first.
For example, you might plan to replace the small devices with larger ones
later, and not want to run an expensive balance operation each time you
replace a small device with a large one.  In that case, you can use
'btrfs fi resize' to set the larger devices to the same size as the
smaller devices.  The devices will then fill equally as the small devices
fill up, and the filesystem will run out of space when the only free
space remaining is all on the large devices.  Before that happens, you
can replace the small devices with larger ones, resize all the devices to
the same size as the large ones, and keep filling the devices equally
until all available space is used.  You'd have to manage the device sizes
yourself, because there's no way btrfs could guess in advance that you
planned to do this.
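
A sketch of the bookkeeping this involves, in Python.  The devids, the
mount point, and the byte sizes are hypothetical (real devids come from
'btrfs filesystem show'), and the helper names are made up.  The script
only prints 'btrfs filesystem resize <devid>:<size> <path>' commands for
review and does not run them.  The capacity function is the usual upper
bound for two-copy raid1, where every chunk needs two different devices.

```python
def resize_plan(device_sizes, mountpoint):
    """Return `btrfs filesystem resize` commands that cap every device
    at the size of the smallest one.

    device_sizes: dict mapping btrfs devid -> size in bytes
                  (hypothetical values for illustration).
    """
    cap = min(device_sizes.values())
    return [
        f"btrfs filesystem resize {devid}:{cap} {mountpoint}"
        for devid, size in sorted(device_sizes.items())
        if size > cap
    ]


def raid1_usable(sizes):
    """Upper bound on raid1 data capacity for the given device sizes."""
    total = sum(sizes)
    return min(total // 2, total - max(sizes))


TB = 10**12
# Hypothetical layout: devids 1-3 are 4 TB drives, 4-5 are 16 TB drives.
devices = {1: 4 * TB, 2: 4 * TB, 3: 4 * TB, 4: 16 * TB, 5: 16 * TB}

for command in resize_plan(devices, "/mnt/array"):
    print(command)

capped = [min(size, 4 * TB) for size in devices.values()]
print(raid1_usable(list(devices.values())) // TB, "TB usable at full size")
print(raid1_usable(capped) // TB, "TB usable while capped")
```

With these numbers the capped array holds 10 TB instead of 22 TB, which
is the space given up until the small devices are replaced.  Growing a
device back later uses the same command with 'max' in place of the byte
count.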



> Regards,
> 
> Pedro
> 
> 
> > 
> > Thanks, Anand
> > 
> > > Interesting question and resolution. I'd be surprised by that as well.
> > > 
> > > Now, a great chance to "btrfs dev delete" all three remaining 4TB
> > > drives and
> > > unplug them for the time being, to save on noise, heat and power
> > > consumption!
> > 
