* btrfs balance to add new drive taking ~60 hours, no progress?
@ 2020-03-01 20:32 Rich Rauenzahn
2020-03-02 1:10 ` Chris Murphy
2020-03-02 1:48 ` Qu Wenruo
0 siblings, 2 replies; 11+ messages in thread
From: Rich Rauenzahn @ 2020-03-01 20:32 UTC (permalink / raw)
To: Btrfs BTRFS
(Is this just taking really long because I didn't provide filters when
balancing across the new drive?)
Also, I DID just change my /etc/fstab to not resume the balance just
in case I reboot:
/.BACKUPS btrfs compress=lzo,subvol=.BACKUPS,skip_balance 1 2
Kernel version:
Kernel: 5.5.5-1.el7.elrepo.x86_64
The pool is mirrored, 2 copies.
The last drive in the list is the one I added. I think it's been at
8MiB the whole time.
$ sudo btrfs fi show /.BACKUPS/
Label: 'BACKUPS' uuid: cfd65dcd-2a63-4fb1-89a7-0bb9ebe66ddf
Total devices 4 FS bytes used 3.64TiB
devid 2 size 1.82TiB used 1.82TiB path /dev/sda1
devid 3 size 1.82TiB used 1.82TiB path /dev/sdc1
devid 4 size 3.64TiB used 3.64TiB path /dev/sdb1
devid 5 size 3.64TiB used 8.31MiB path /dev/sdj1
$ sudo btrfs fi df /.BACKUPS/
Data, RAID1: total=3.63TiB, used=3.63TiB
System, RAID1: total=32.00MiB, used=736.00KiB
Metadata, RAID1: total=5.00GiB, used=3.88GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
$ btrfs fi usage /.BACKUPS/
WARNING: cannot read detailed chunk info, RAID5/6 numbers will be
incorrect, run as root
Overall:
Device size: 10.92TiB
Device allocated: 7.28TiB
Device unallocated: 3.64TiB
Device missing: 10.92TiB
Used: 7.27TiB
Free (estimated): 1.82TiB (min: 1.82TiB)
Data ratio: 2.00
Metadata ratio: 2.00
Global reserve: 512.00MiB (used: 0.00B)
$ sudo btrfs fi usage /.BACKUPS/
Overall:
Device size: 10.92TiB
Device allocated: 7.28TiB
Device unallocated: 3.64TiB
Device missing: 0.00B
Used: 7.27TiB
Free (estimated): 1.82TiB (min: 1.82TiB)
Data ratio: 2.00
Metadata ratio: 2.00
Global reserve: 512.00MiB (used: 0.00B)
Data,RAID1: Size:3.63TiB, Used:3.63TiB
/dev/sda1 1.82TiB
/dev/sdb1 3.63TiB
/dev/sdc1 1.82TiB
/dev/sdj1 8.31MiB
Metadata,RAID1: Size:5.00GiB, Used:3.88GiB
/dev/sda1 3.00GiB
/dev/sdb1 5.00GiB
/dev/sdc1 2.00GiB
System,RAID1: Size:32.00MiB, Used:736.00KiB
/dev/sda1 32.00MiB
/dev/sdb1 32.00MiB
Unallocated:
/dev/sda1 1.00MiB
/dev/sdb1 1.00MiB
/dev/sdc1 1.00MiB
/dev/sdj1 3.64TiB
Processes (I also tried a cancel, which is just hung as well)
4 S root 3665 1 0 80 0 - 60315 - 06:45 ?
00:00:00 sudo btrfs balance cancel /.BACKUPS/
4 D root 3666 3665 0 80 0 - 3983 - 06:45 ?
00:00:00 btrfs balance cancel /.BACKUPS/
4 S root 14035 1 0 80 0 - 60315 - Feb28 ?
00:00:00 sudo btrfs filesystem balance /.BACKUPS/
4 D root 14036 14035 2 80 0 - 3984 - Feb28 ?
00:59:12 btrfs filesystem balance /.BACKUPS/
All four drives ARE blinking, and the process takes <10% CPU, but > 0%.
2.6%:
14036 root 20 0 15936 656 520 D 2.6 0.0 59:13.90
btrfs filesystem balance /.BACKUPS/
df, while probably misleading with btrfs:
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda1 5860531080 3906340128 384 100% /.BACKUPS
dmesg has a lot of these, and you can see they are issued pretty quickly:
[773986.367090] BTRFS info (device sda1): found 472 extents
[773986.583133] BTRFS info (device sda1): found 472 extents
[773986.799169] BTRFS info (device sda1): found 472 extents
sar output of relevant drives (10 secs):
10:26:23 AM DEV tps rd_sec/s wr_sec/s avgrq-sz
avgqu-sz await svctm %util
10:26:26 AM sdb 78.45 0.00 2312.37 29.48
0.48 6.64 0.58 4.52
10:26:26 AM sda 78.80 0.00 2312.37 29.35
0.94 12.53 0.53 4.20
10:26:26 AM sdc 36.40 0.00 220.49 6.06
0.25 7.24 0.85 3.11
10:26:26 AM sdj 36.40 0.00 220.49 6.06
0.23 6.74 0.83 3.04
$ sudo btrfs balance status -v /.BACKUPS/
Balance on '/.BACKUPS/' is running, cancel requested
0 out of about 3733 chunks balanced (29 considered), 100% left
Dumping filters: flags 0x7, state 0x5, force is off
DATA (flags 0x0): balancing
METADATA (flags 0x0): balancing
SYSTEM (flags 0x0): balancing
Oh, and the drive does think it is out of space even though the drive
has been added:
$ dd if=/dev/random of=random
dd: writing to ‘random’: No space left on device
0+7 records in
0+0 records out
0 bytes (0 B) copied, 0.341074 s, 0.0 kB/s
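For reference, the filters the opening question mentions restrict which block groups a balance touches. A hedged sketch of the syntax (the thresholds are arbitrary examples, not values from this thread); the commands are built and printed as a dry run rather than executed:

```shell
# Illustrative only: balance filters can make a run far shorter by limiting
# which data block groups get relocated. Drop the 'echo' lines' indirection
# to actually run these (they need a mounted btrfs and root).
MNT=/.BACKUPS                                           # mount point from this thread
usage_cmd="sudo btrfs balance start -dusage=50 $MNT"    # only data chunks <=50% full
limit_cmd="sudo btrfs balance start -dlimit=10 $MNT"    # relocate at most 10 data chunks
echo "$usage_cmd"
echo "$limit_cmd"
```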
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: btrfs balance to add new drive taking ~60 hours, no progress?
2020-03-01 20:32 btrfs balance to add new drive taking ~60 hours, no progress? Rich Rauenzahn
@ 2020-03-02 1:10 ` Chris Murphy
2020-03-02 1:50 ` Rich Rauenzahn
[not found] ` <CAG+QAKUzqdVf88G9ZdLKLa3YUQRcvJMS47qQkhLsgiQ46R19Bw@mail.gmail.com>
2020-03-02 1:48 ` Qu Wenruo
1 sibling, 2 replies; 11+ messages in thread
From: Chris Murphy @ 2020-03-02 1:10 UTC (permalink / raw)
To: Rich Rauenzahn; +Cc: Btrfs BTRFS
On Sun, Mar 1, 2020 at 1:32 PM Rich Rauenzahn <rrauenza@gmail.com> wrote:
>
> (Is this just taking really long because I didn't provide filters when
> balancing across the new drive?)
I don't think so. It might be fairly wedged in because it has no
unallocated space on 3 of 4 drives, and is writing into already
allocated block groups.
I think the mistake was adding only one new drive instead of two *and*
then also doing a balance.
I also think it's possible there's a bug, where Btrfs is trying too
hard to avoid ENOSPC. Ironic if true. It should just give up, or at
least it should cancel faster.
>
> $ sudo btrfs fi show /.BACKUPS/
> Label: 'BACKUPS' uuid: cfd65dcd-2a63-4fb1-89a7-0bb9ebe66ddf
> Total devices 4 FS bytes used 3.64TiB
> devid 2 size 1.82TiB used 1.82TiB path /dev/sda1
> devid 3 size 1.82TiB used 1.82TiB path /dev/sdc1
> devid 4 size 3.64TiB used 3.64TiB path /dev/sdb1
> devid 5 size 3.64TiB used 8.31MiB path /dev/sdj1
This suggests 3 of 4 are full.
> $ sudo btrfs fi usage /.BACKUPS/
> Overall:
> Device size: 10.92TiB
> Device allocated: 7.28TiB
> Device unallocated: 3.64TiB
> Device missing: 0.00B
> Used: 7.27TiB
> Free (estimated): 1.82TiB (min: 1.82TiB)
> Data ratio: 2.00
> Metadata ratio: 2.00
> Global reserve: 512.00MiB (used: 0.00B)
>
> Data,RAID1: Size:3.63TiB, Used:3.63TiB
> /dev/sda1 1.82TiB
> /dev/sdb1 3.63TiB
> /dev/sdc1 1.82TiB
> /dev/sdj1 8.31MiB
>
> Metadata,RAID1: Size:5.00GiB, Used:3.88GiB
> /dev/sda1 3.00GiB
> /dev/sdb1 5.00GiB
> /dev/sdc1 2.00GiB
>
> System,RAID1: Size:32.00MiB, Used:736.00KiB
> /dev/sda1 32.00MiB
> /dev/sdb1 32.00MiB
>
> Unallocated:
> /dev/sda1 1.00MiB
> /dev/sdb1 1.00MiB
> /dev/sdc1 1.00MiB
> /dev/sdj1 3.64TiB
Free is 1.82TiB, exactly half of the unallocated space, all of which is on
one drive with none on the others, so yeah this file system is 100% full.
Adding one drive was not enough; it's raid1. You needed to add two
drives.
So now what? The problem is you have a balance in-progress, and a
cancel in-progress, and I'm not sure which is less risky:
- add another device, even if it's small like a 32G partition or flash drive
- force reboot
What I *would* do before you do anything else is disable the write
cache on all the drives. At least that way if you have to force a
reboot, there's less of a chance COW and barrier guarantees can be
thwarted.
Be careful with hdparm, small w is dangerous, capital W is what you want.
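A minimal sketch of the write-cache advice above, assuming the four drives named in this thread. It prints the commands instead of running them, so nothing is touched until the 'echo' is removed and it is run as root:

```shell
# hdparm -W 0 disables the drive's volatile write cache (capital W).
# Lowercase -w is an unrelated, dangerous drive-reset option -- never use it.
cmds=""
for dev in /dev/sda /dev/sdb /dev/sdc /dev/sdj; do
    cmd="hdparm -W 0 $dev"
    echo "$cmd"          # dry run: show what would be executed
    cmds="$cmds$cmd
"
done
```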
--
Chris Murphy
* Re: btrfs balance to add new drive taking ~60 hours, no progress?
2020-03-01 20:32 btrfs balance to add new drive taking ~60 hours, no progress? Rich Rauenzahn
2020-03-02 1:10 ` Chris Murphy
@ 2020-03-02 1:48 ` Qu Wenruo
1 sibling, 0 replies; 11+ messages in thread
From: Qu Wenruo @ 2020-03-02 1:48 UTC (permalink / raw)
To: Rich Rauenzahn, Btrfs BTRFS
On 2020/3/2 上午4:32, Rich Rauenzahn wrote:
> (Is this just taking really long because I didn't provide filters when
> balancing across the new drive?)
>
> Also, I DID just change my /etc/fstab to not resume the balance just
> in case I reboot:
>
> /.BACKUPS btrfs compress=lzo,subvol=.BACKUPS,skip_balance 1 2
>
> Kernel version:
>
> Kernel: 5.5.5-1.el7.elrepo.x86_64
>
> The pool is mirrored, 2 copies.
>
> The last drive in the list is the one I added. I think it's been at
> 8MiB the whole time.
>
> $ sudo btrfs fi show /.BACKUPS/
> Label: 'BACKUPS' uuid: cfd65dcd-2a63-4fb1-89a7-0bb9ebe66ddf
> Total devices 4 FS bytes used 3.64TiB
> devid 2 size 1.82TiB used 1.82TiB path /dev/sda1
> devid 3 size 1.82TiB used 1.82TiB path /dev/sdc1
> devid 4 size 3.64TiB used 3.64TiB path /dev/sdb1
> devid 5 size 3.64TiB used 8.31MiB path /dev/sdj1
>
> $ sudo btrfs fi df /.BACKUPS/
> Data, RAID1: total=3.63TiB, used=3.63TiB
> System, RAID1: total=32.00MiB, used=736.00KiB
> Metadata, RAID1: total=5.00GiB, used=3.88GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B
>
> $ btrfs fi usage /.BACKUPS/
> WARNING: cannot read detailed chunk info, RAID5/6 numbers will be
> incorrect, run as root
> Overall:
> Device size: 10.92TiB
> Device allocated: 7.28TiB
> Device unallocated: 3.64TiB
> Device missing: 10.92TiB
> Used: 7.27TiB
> Free (estimated): 1.82TiB (min: 1.82TiB)
> Data ratio: 2.00
> Metadata ratio: 2.00
> Global reserve: 512.00MiB (used: 0.00B)
>
> $ sudo btrfs fi usage /.BACKUPS/
> Overall:
> Device size: 10.92TiB
> Device allocated: 7.28TiB
> Device unallocated: 3.64TiB
> Device missing: 0.00B
> Used: 7.27TiB
> Free (estimated): 1.82TiB (min: 1.82TiB)
> Data ratio: 2.00
> Metadata ratio: 2.00
> Global reserve: 512.00MiB (used: 0.00B)
>
> Data,RAID1: Size:3.63TiB, Used:3.63TiB
> /dev/sda1 1.82TiB
> /dev/sdb1 3.63TiB
> /dev/sdc1 1.82TiB
> /dev/sdj1 8.31MiB
>
> Metadata,RAID1: Size:5.00GiB, Used:3.88GiB
> /dev/sda1 3.00GiB
> /dev/sdb1 5.00GiB
> /dev/sdc1 2.00GiB
>
> System,RAID1: Size:32.00MiB, Used:736.00KiB
> /dev/sda1 32.00MiB
> /dev/sdb1 32.00MiB
>
> Unallocated:
> /dev/sda1 1.00MiB
> /dev/sdb1 1.00MiB
> /dev/sdc1 1.00MiB
> /dev/sdj1 3.64TiB
>
>
> Processes (I also tried a cancel, which is just hung as well)
>
> 4 S root 3665 1 0 80 0 - 60315 - 06:45 ?
> 00:00:00 sudo btrfs balance cancel /.BACKUPS/
> 4 D root 3666 3665 0 80 0 - 3983 - 06:45 ?
> 00:00:00 btrfs balance cancel /.BACKUPS/
> 4 S root 14035 1 0 80 0 - 60315 - Feb28 ?
> 00:00:00 sudo btrfs filesystem balance /.BACKUPS/
> 4 D root 14036 14035 2 80 0 - 3984 - Feb28 ?
> 00:59:12 btrfs filesystem balance /.BACKUPS/
>
> All four drives ARE blinking, and the process takes <10% CPU, but > 0%.
>
> 2.6%:
>
> 14036 root 20 0 15936 656 520 D 2.6 0.0 59:13.90
> btrfs filesystem balance /.BACKUPS/
>
> df, while probably misleading with btrfs:
>
> Filesystem 1K-blocks Used Available Use% Mounted on
> /dev/sda1 5860531080 3906340128 384 100% /.BACKUPS
>
> dmesg has a lot of these, and you can see they are issued pretty quickly:
>
> [773986.367090] BTRFS info (device sda1): found 472 extents
> [773986.583133] BTRFS info (device sda1): found 472 extents
> [773986.799169] BTRFS info (device sda1): found 472 extents
That's a runaway balance.
>
> sar output of relevant drives (10 secs):
>
> 10:26:23 AM DEV tps rd_sec/s wr_sec/s avgrq-sz
> avgqu-sz await svctm %util
> 10:26:26 AM sdb 78.45 0.00 2312.37 29.48
> 0.48 6.64 0.58 4.52
> 10:26:26 AM sda 78.80 0.00 2312.37 29.35
> 0.94 12.53 0.53 4.20
> 10:26:26 AM sdc 36.40 0.00 220.49 6.06
> 0.25 7.24 0.85 3.11
> 10:26:26 AM sdj 36.40 0.00 220.49 6.06
> 0.23 6.74 0.83 3.04
>
> $ sudo btrfs balance status -v /.BACKUPS/
> Balance on '/.BACKUPS/' is running, cancel requested
And ironically, canceling is currently the primary way to end up with a
runaway balance.
So to properly cancel the runaway balance, you need to apply the
latest quicker-canceling patchset:
https://patchwork.kernel.org/project/linux-btrfs/list/?series=242357
Thanks,
Qu
> 0 out of about 3733 chunks balanced (29 considered), 100% left
> Dumping filters: flags 0x7, state 0x5, force is off
> DATA (flags 0x0): balancing
> METADATA (flags 0x0): balancing
> SYSTEM (flags 0x0): balancing
>
> Oh, and the drive does think it is out of space even though the drive
> has been added:
>
> $ dd if=/dev/random of=random
> dd: writing to ‘random’: No space left on device
> 0+7 records in
> 0+0 records out
> 0 bytes (0 B) copied, 0.341074 s, 0.0 kB/s
>
* Re: btrfs balance to add new drive taking ~60 hours, no progress?
2020-03-02 1:10 ` Chris Murphy
@ 2020-03-02 1:50 ` Rich Rauenzahn
[not found] ` <CAG+QAKUzqdVf88G9ZdLKLa3YUQRcvJMS47qQkhLsgiQ46R19Bw@mail.gmail.com>
1 sibling, 0 replies; 11+ messages in thread
From: Rich Rauenzahn @ 2020-03-02 1:50 UTC (permalink / raw)
To: Chris Murphy; +Cc: Btrfs BTRFS
[Resending because gmail on iphone didn't do plain text]
On Sun, Mar 1, 2020 at 5:10 PM Chris Murphy <lists@colorremedies.com> wrote:
>
> Free is 1.82TiB, exactly half of the unallocated space, all of which is on
> one drive with none on the others, so yeah this file system is 100% full.
> Adding one drive was not enough; it's raid1. You needed to add two
> drives.
I'm not following the btrfs logic here - I had three drives, 2 x 2TB
and 1 x 4TB, and added a 4TB.
That was a total of 4TB in RAID0. Wouldn't adding a fourth drive give
me 6TB and some of the blocks just moved from the three drives onto
the fourth during the rebalance?
Is there a particular 2nd copy policy I'm not aware of?
Or is it that it is trying to create new allocations on the new drive
as part of the balance but can't because they wouldn't be mirrored?
But I still don't get why it wouldn't move blocks from the full
drives...
>
> So now what? The problem is you have a balance in-progress, and a
> cancel in-progress, and I'm not sure which is less risky:
>
> - add another device, even if it's small like a 32G partition or flash drive
> - force reboot
I have a 150 GB of files I can remove ... I'll try that first.
Thank you for your help.
> What I *would* do before you do anything else is disable the write
> cache on all the drives. At least that way if you have to force a
> reboot, there's less of a chance COW and barrier guarantees can be
> thwarted.
>
> Be careful with hdparm, small w is dangerous, capital W is what you want.
Oh good idea!
>
>
> --
> Chris Murphy
* Re: btrfs balance to add new drive taking ~60 hours, no progress?
[not found] ` <CAG+QAKUzqdVf88G9ZdLKLa3YUQRcvJMS47qQkhLsgiQ46R19Bw@mail.gmail.com>
@ 2020-03-02 1:57 ` Chris Murphy
2020-03-02 2:24 ` Rich Rauenzahn
2020-03-02 2:38 ` Rich Rauenzahn
0 siblings, 2 replies; 11+ messages in thread
From: Chris Murphy @ 2020-03-02 1:57 UTC (permalink / raw)
To: Rich Rauenzahn; +Cc: Chris Murphy, Btrfs BTRFS
On Sun, Mar 1, 2020 at 6:26 PM Rich Rauenzahn <rrauenza@gmail.com> wrote:
>
>
>
> On Sun, Mar 1, 2020 at 5:10 PM Chris Murphy <lists@colorremedies.com> wrote:
>>
>> Free is 1.82TiB, exactly half of the unallocated space, all of which is on
>> one drive with none on the others, so yeah this file system is 100% full.
>> Adding one drive was not enough; it's raid1. You needed to add two
>> drives.
>
>
> I'm not following the btrfs logic here - I had three drives, 2 x 2TB and 1 x 4TB, and added a 4TB.
The original three drives:
devid 2 size 1.82TiB used 1.82TiB path /dev/sda1
devid 3 size 1.82TiB used 1.82TiB path /dev/sdc1
devid 4 size 3.64TiB used 3.64TiB path /dev/sdb1
Simplistically, devid 2 mirrors with 50% of devid 4, and devid 3
mirrors with the other 50% of devid 4. You have 4TB of data on an 8TB
volume in a raid1 configuration. That's completely full and using up
all space.
Then you added one drive. Doesn't matter what its size is. There's
nowhere for more data to go.
https://carfax.org.uk/btrfs-usage/
>
> That was a total of 4TB in RAID0.
The volume is 8TB in a RAID 1 before adding the 4th drive. You can put
4TB of data on that volume, and it will be full and balanced.
> Wouldn't adding a fourth drive give me 6TB and some of the blocks just moved from the three drives onto the fourth?
Adding one 4TB drive gives you one empty 4TB, and three full drives.
The first copy of a chunk can go to the new drive, but there's nowhere
for the 2nd copy to go because the other three drives are full.
> Is there a particular 2nd copy policy I'm not aware of?
Btrfs raid1 is not block based like either mdadm or LVM raid. It's
based on the block group. A data block group is typically 1GiB. When a
data block group has raid1 profile, it has two stripes. In this case a
stripe is a copy. Using your devid 2, 3, 4 where 2 and 3 are the same
size and devid 4 is 2x you have three possible kinds of blockgroup
stripe combinations:
2+3
2+4
3+4
How many you have of each just depends on the life history of the
volume; but if it were never balanced and all three drives were
together from the start, you might in theory have only 2+4 and 3+4
block groups.
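The allocation behavior Chris describes can be sketched with a toy greedy loop, in the spirit of the carfax calculator: each 1GiB raid1 block group takes one stripe from each of the two devices with the most unallocated space. The sizes below are my approximations of this thread's drives in GiB, not exact figures:

```shell
# Toy model (assumption, mirroring carfax.org.uk/btrfs-usage): two ~2TB
# drives (1863GiB) and two ~4TB drives (3726GiB), raid1 data only.
free=(1863 1863 3726 3726)
total=0
while :; do
    a=-1; b=-1                           # indexes of the two emptiest devices
    for i in "${!free[@]}"; do
        if [ "$a" -lt 0 ] || [ "${free[i]}" -gt "${free[a]}" ]; then
            b=$a; a=$i
        elif [ "$b" -lt 0 ] || [ "${free[i]}" -gt "${free[b]}" ]; then
            b=$i
        fi
    done
    [ "${free[b]}" -le 0 ] && break      # second-emptiest device is full: stop
    free[a]=$(( free[a] - 1 ))           # one stripe on each of two devices
    free[b]=$(( free[b] - 1 ))
    total=$(( total + 1 ))
done
echo "usable raid1 data: ${total} GiB"   # half of the 11178GiB raw total
```

This is why adding one empty drive doesn't help by itself: once the other three are full, the loop can no longer find a second device with free space.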
> Or is it that it is trying to create new allocations on the new drive as part of the balance but can't because they wouldn't be mirrored?
Correct. The balance must move "the block group" and there are
effectively two copies. You have room for one to be moved. Not two.
If there was even 10GiB unallocated on one of the drives, this would
be going a LOT faster. But it looks like the three drives were full
before you got started (?) so it's wedged itself in.
>But I still don't get why it wouldn't move blocks from the full drives...
To where? There's only one drive with unallocated space.
--
Chris Murphy
* Re: btrfs balance to add new drive taking ~60 hours, no progress?
2020-03-02 1:57 ` Chris Murphy
@ 2020-03-02 2:24 ` Rich Rauenzahn
2020-03-02 2:38 ` Rich Rauenzahn
1 sibling, 0 replies; 11+ messages in thread
From: Rich Rauenzahn @ 2020-03-02 2:24 UTC (permalink / raw)
To: Chris Murphy; +Cc: Btrfs BTRFS
On Sun, Mar 1, 2020 at 5:57 PM Chris Murphy <lists@colorremedies.com> wrote:
> > I'm not following the btrfs logic here - I had three drives, 2 x 2TB and 1 x 4TB, and added a 4TB.
>
> The original three drives:
>
> devid 2 size 1.82TiB used 1.82TiB path /dev/sda1
> devid 3 size 1.82TiB used 1.82TiB path /dev/sdc1
> devid 4 size 3.64TiB used 3.64TiB path /dev/sdb1
>
> Simplistically, devid 2 mirrors with 50% of devid 4, and devid 3
> mirrors with the other 50% of devid 4. You have 4TB of data on an 8TB
> volume in a raid1 configuration. That's completely full and using up
> all space.
>
> Then you added one drive. Doesn't matter what its size is. There's no
> where for more data to go.
>
> https://carfax.org.uk/btrfs-usage/
That shows I should have 6TB at RAID1, but only 4TB at RAID10. I'm at
RAID1, not RAID10. (Although I'm not sure what the difference is
exactly in this context...):
$ sudo btrfs fi df /.BACKUPS/
Data, RAID1: total=3.63TiB, used=3.62TiB
System, RAID1: total=32.00MiB, used=736.00KiB
Metadata, RAID1: total=5.00GiB, used=3.87GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
The web tool says:
Total space for files: 6000
Unusable: 0
Device 2 has size 4000
Device 3 has size 4000
Device 0 has size 2000
Device 1 has size 2000
Allocate 2 chunks at a time
Trivial bound is 12000 / 2 = 6000
q=0 bound is 8000 / 1 = 8000
* Re: btrfs balance to add new drive taking ~60 hours, no progress?
2020-03-02 1:57 ` Chris Murphy
2020-03-02 2:24 ` Rich Rauenzahn
@ 2020-03-02 2:38 ` Rich Rauenzahn
2020-03-02 3:45 ` Rich Rauenzahn
2020-03-02 3:59 ` Chris Murphy
1 sibling, 2 replies; 11+ messages in thread
From: Rich Rauenzahn @ 2020-03-02 2:38 UTC (permalink / raw)
To: Chris Murphy; +Cc: Btrfs BTRFS
On Sun, Mar 1, 2020 at 5:57 PM Chris Murphy <lists@colorremedies.com> wrote:
> >But I still don't get why it wouldn't move blocks from the full drives...
>
> To where? There's only one drive with unallocated space.
...but that's what I'd expect the balance to do? If Block (Chunk?) A
is on, say device 1 (4TB) and device 2 (2TB), why wouldn't it move
Block A to the new drive from device 1 or 2 in order to free up space
and balance/spread out usage across the drives? Is that not what
balance's purpose is? Or is free space required on 1 or 2 in order
to move the allocation to the new drive?
OH! Just checked -- the balance finally cancelled after freeing up the 150GB.
Rich
* Re: btrfs balance to add new drive taking ~60 hours, no progress?
2020-03-02 2:38 ` Rich Rauenzahn
@ 2020-03-02 3:45 ` Rich Rauenzahn
2020-03-02 4:04 ` Chris Murphy
2020-03-02 3:59 ` Chris Murphy
1 sibling, 1 reply; 11+ messages in thread
From: Rich Rauenzahn @ 2020-03-02 3:45 UTC (permalink / raw)
To: Chris Murphy; +Cc: Btrfs BTRFS
On Sun, Mar 1, 2020 at 6:38 PM Rich Rauenzahn <rrauenza@gmail.com> wrote:
> ...but that's what I'd expect the balance to do? If Block (Chunk?) A
> is on, say device 1 (4TB) and device 2 (2TB), why wouldn't it move
> Block A to the new drive from device 1 or 2 in order to free up space
> and balance/spread out usage across the drives? Is that not what
> balance's purpose is? Or is free space required on 1 or 2 in order
> to move the allocation to the new drive?
Am I just not taking COW into account? It rewrites the chunk to
reallocate so needs two destinations? There is no "move"?
* Re: btrfs balance to add new drive taking ~60 hours, no progress?
2020-03-02 2:38 ` Rich Rauenzahn
2020-03-02 3:45 ` Rich Rauenzahn
@ 2020-03-02 3:59 ` Chris Murphy
1 sibling, 0 replies; 11+ messages in thread
From: Chris Murphy @ 2020-03-02 3:59 UTC (permalink / raw)
To: Rich Rauenzahn; +Cc: Chris Murphy, Btrfs BTRFS
On Sun, Mar 1, 2020 at 7:38 PM Rich Rauenzahn <rrauenza@gmail.com> wrote:
>
> On Sun, Mar 1, 2020 at 5:57 PM Chris Murphy <lists@colorremedies.com> wrote:
> > >But I still don't get why it wouldn't move blocks from the full drives...
> >
> > To where? There's only one drive with unallocated space.
>
> ...but that's what I'd expect the balance to do? If Block (Chunk?) A
> is on, say device 1 (4TB) and device 2 (2TB), why wouldn't it move
> Block A to the new drive from device 1 or 2 in order to free up space
> and balance/spread out usage across the drives?
That isn't how balance works on Btrfs. To do a balance on raid1 it
means reading a 1GiB chunk, and writing 1GiB *into empty space* on
drive X and 1GiB *into empty space* on drive Y. And then only after
that succeeds is the original 1GiB chunk (1GiB each on two devices)
freed.
No such thing as move. Everything is a copy.
> OH! Just checked -- the balance finally cancelled after freeing up the 150GB.
OK good. At this point it should have the headroom to do the balance.
It still might be slower than it would be if it had say 25% free space
on the original three drives.
--
Chris Murphy
* Re: btrfs balance to add new drive taking ~60 hours, no progress?
2020-03-02 3:45 ` Rich Rauenzahn
@ 2020-03-02 4:04 ` Chris Murphy
2020-03-02 4:58 ` Rich Rauenzahn
0 siblings, 1 reply; 11+ messages in thread
From: Chris Murphy @ 2020-03-02 4:04 UTC (permalink / raw)
To: Rich Rauenzahn; +Cc: Chris Murphy, Btrfs BTRFS
On Sun, Mar 1, 2020 at 8:45 PM Rich Rauenzahn <rrauenza@gmail.com> wrote:
>
> On Sun, Mar 1, 2020 at 6:38 PM Rich Rauenzahn <rrauenza@gmail.com> wrote:
> > ...but that's what I'd expect the balance to do? If Block (Chunk?) A
> > is on, say device 1 (4TB) and device 2 (2TB), why wouldn't it move
> > Block A to the new drive from device 1 or 2 in order to free up space
> > and balance/spread out usage across the drives? Is that not what
> > balance's purpose is? Or is free space required on 1 or 2 in order
> > to move the allocation to the new drive?
>
> Am I just not taking COW into account? It rewrites the chunk to
> reallocate so needs two destinations? There is no "move"?
Correct. The block group is a single unit, there isn't (yet) a short
cut to COW just one of the two copies to another device.
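The point above can be captured in a tiny check: a raid1 block group can only be relocated if at least two distinct devices each have room for a new ~1GiB stripe. The per-device figures below are my rough MiB conversions of the 'Unallocated:' listings from earlier and later in this thread:

```shell
# Hedged model: count devices with >=1GiB unallocated; relocation of a
# raid1 block group needs at least two of them at the same time.
can_relocate() {            # args: unallocated space per device, in MiB
    ok=0
    for v in "$@"; do
        [ "$v" -ge 1024 ] && ok=$(( ok + 1 ))
    done
    if [ "$ok" -ge 2 ]; then echo yes; else echo no; fi
}
before=$(can_relocate 1 1 1 3817472)             # only sdj has room
after=$(can_relocate 38984 45087 24576 3795845)  # after ~150GB was freed
echo "before: $before, after: $after"
```

This matches what happened in the thread: the balance only got unstuck once deleting files gave a second device some unallocated space.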
--
Chris Murphy
* Re: btrfs balance to add new drive taking ~60 hours, no progress?
2020-03-02 4:04 ` Chris Murphy
@ 2020-03-02 4:58 ` Rich Rauenzahn
0 siblings, 0 replies; 11+ messages in thread
From: Rich Rauenzahn @ 2020-03-02 4:58 UTC (permalink / raw)
To: Chris Murphy; +Cc: Btrfs BTRFS
Ok, we're making balance progress now... thank you for the help. The
COW aspect makes sense regarding why the rebalance was stuck.
$ sudo btrfs fi usage /.BACKUPS/
Overall:
Device size: 10.92TiB
Device allocated: 7.19TiB
Device unallocated: 3.72TiB
Device missing: 0.00B
Used: 7.18TiB
Free (estimated): 1.87TiB (min: 1.87TiB)
Data ratio: 2.00
Metadata ratio: 2.00
Global reserve: 512.00MiB (used: 0.00B)
Data,RAID1: Size:3.59TiB, Used:3.58TiB
/dev/sda1 1.78TiB
/dev/sdb1 3.59TiB
/dev/sdc1 1.79TiB
/dev/sdj1 17.00GiB
Metadata,RAID1: Size:4.00GiB, Used:3.83GiB
/dev/sda1 1.00GiB
/dev/sdb1 4.00GiB
/dev/sdc1 2.00GiB
/dev/sdj1 1.00GiB
System,RAID1: Size:32.00MiB, Used:720.00KiB
/dev/sdb1 32.00MiB
/dev/sdj1 32.00MiB
Unallocated:
/dev/sda1 38.07GiB
/dev/sdb1 44.03GiB
/dev/sdc1 24.00GiB
/dev/sdj1 3.62TiB