public inbox for linux-btrfs@vger.kernel.org
From: Eric Wong <e@80x24.org>
To: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Cc: kreijack@inwind.it, linux-btrfs@vger.kernel.org
Subject: Re: adding new devices to degraded raid1
Date: Fri, 28 Aug 2020 02:34:12 +0000
Message-ID: <20200828023412.GA308@dcvr>
In-Reply-To: <20200828003037.GU5890@hungrycats.org>

Zygo Blaxell <ce3g8jdj@umail.furryterror.org> wrote:
> Note that add/remove is orders of magnitude slower than replace.
> Replace might take hours or even a day or two on a huge spinning drive.
> Add/remove might take _months_, though if you have 8-year-old disks
> then it's probably a few days, weeks at most.

Btw, has anyone explained or profiled why remove is so much
slower than replace?  Especially since btrfs raid1 ought to be
fairly mature at this point (and I run recent stable kernels).

Converting a single drive to raid1 was not slow at all, either.
RAID 1 ought to be straightforward if there's plenty of free
space, one would think...

> Add/remove does work for raid1* (i.e. raid1, raid10, raid1c3, raid1c4).
> At the moment only 'replace' works reliably for raid5/raid6.

Noted, I'm staying far, far away from raid5/6 :)  Thanks for
your posts on that topic, by the way.

> On Thu, Aug 27, 2020 at 07:14:18PM +0200, Goffredo Baroncelli wrote:
> > Instead of
> > 
> >  	btrfs device remove broken /mnt/foo
> > 
> > You should do
> > 
> > 	btrfs device remove missing /mnt/foo
> > 
> > ("missing" has to be written as is; it is a special term, see man page)

Thanks Goffredo, noted.

> > and
> > 
> > 	btrfs balance start /mnt/foo
> 
> If the replacement disks are larger than half the size of the failed disk
> then device remove may do sufficient data relocation and you won't need
> balance.  Once all the disks have equal amounts of unallocated space in
> 'btrfs fi usage' you can cancel any balances that are running.
> 
> On the other hand, if the replacement disks are close to half the size
> of the failed disk, then some careful balance filtering is required in
> order to utilize all the available space.  This filtering is more than
> what the stock tool offers.  You have to make sure that there are no block
> groups with a mirror copy on both of the small disks, as any such block
> group removes 1GB of available mirror space for data on the largest disk.
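
The "half the size" remark can be checked with a little arithmetic.
This is only a sketch, assuming the usual rule of thumb for two-copy
raid1: usable space is the smaller of half the total and the sum of
everything except the largest device.  The function name and the disk
sizes are made up for illustration.

```shell
#!/bin/sh
# Estimate usable two-copy raid1 capacity from device sizes (GiB).
# Each block group needs its two copies on two different devices, so
# capacity is capped both by half the total and by how much the other
# devices can mirror for the largest one.
raid1_usable() {
    total=0
    largest=0
    for size in "$@"; do
        total=$((total + size))
        if [ "$size" -gt "$largest" ]; then
            largest=$size
        fi
    done
    others=$((total - largest))
    half=$((total / 2))
    if [ "$others" -lt "$half" ]; then
        echo "$others"
    else
        echo "$half"
    fi
}

# Hypothetical layout: a surviving 4000 GiB disk plus two 2000 GiB
# replacements.  There is no slack here: the full figure is reachable
# only if every block group keeps one copy on the big disk.
raid1_usable 4000 2000 2000
```

With replacements comfortably larger than half (say 3000 GiB each) the
limit becomes half the total and placement matters much less.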

Yikes, that balancing sounds like a pain.  I'm not super-limited
on space, and a fair bit gets overwritten or replaced as time
goes on, anyways.
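
For reference, the sequence discussed above might look roughly like
this.  It is a sketch, not a tested recipe: /mnt/foo is the example
mount point from the quoted text, the run/drop_missing/stop_balance
helpers are hypothetical names, and everything is a dry run unless
DRY_RUN=0 is set.

```shell
#!/bin/sh
# Print commands by default; set DRY_RUN=0 to really run them (as root).
run() {
    if [ "${DRY_RUN:-1}" = 0 ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# Drop the failed device, then show per-device allocation so the
# "Unallocated" figures can be compared across disks.
drop_missing() {
    run btrfs device remove missing "$1"
    run btrfs filesystem usage "$1"
}

# Once unallocated space looks even, check for and stop a running balance.
stop_balance() {
    run btrfs balance status "$1"
    run btrfs balance cancel "$1"
}

drop_missing /mnt/foo
stop_balance /mnt/foo
```

Whether a balance is needed at all depends on how the unallocated
space looks after the remove finishes.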

I wonder how far I could get with some lossless rewrites, which
might make sense anyway.

1) full "git gc" (I have a fair number of git repos)
   Maybe setting pack.compression=0 will even help dedupe
   similar repos (but they'll be no fun to serve over the network)

2) replacing some manually-compressed files with uncompressed
   versions (let btrfs compression handle it).  I expect that'll
   let dedupe work better, too.

   I have a lot of FLAC that could live as uncompressed .sox
   files.  I expect FLAC to be more efficient on single files,
   but dedupe could save on cuts that are/were used for editing.
   I won't miss FLAC MD5 checksums when btrfs has checksums, either.

3) is this also something defrag can help with?
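
Items 1 and 2 could be sketched like this.  Assumptions: the repo and
file paths are hypothetical, the helper names are made up, and my
reading of git-repack(1) is that -F is what forces existing objects to
be rewritten at the new compression level.  Dry run unless DRY_RUN=0.

```shell
#!/bin/sh
# Print commands by default; set DRY_RUN=0 to really run them.
run() {
    if [ "${DRY_RUN:-1}" = 0 ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# Rewrite a repo as one pack with zlib level 0.
# -a -d: pack everything and drop the old packs
# -F:    do not reuse existing object data, so everything is re-encoded
repack_uncompressed() {
    run git -C "$1" -c pack.compression=0 repack -a -d -F
}

# Convert a FLAC file to SoX's native uncompressed format; sox picks
# the output format from the .sox extension.
flac_to_sox() {
    run sox "$1" "${1%.flac}.sox"
}

repack_uncompressed /path/to/repo.git
flac_to_sox take1.flac
```

Neither helper deletes the original FLAC or checks the result, so a
real run would want a verification step before removing anything.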

Thanks again.
