linux-btrfs.vger.kernel.org archive mirror
* Best Practice: Add new device to RAID1 pool
@ 2017-07-24 11:27 Cloud Admin
  2017-07-24 13:46 ` Austin S. Hemmelgarn
  2017-07-24 20:35 ` Best Practice: Add new device to RAID1 pool Chris Murphy
  0 siblings, 2 replies; 21+ messages in thread
From: Cloud Admin @ 2017-07-24 11:27 UTC (permalink / raw)
  To: linux-btrfs

Hi,
I have a multi-device pool (three discs) as RAID1. Now I want to add a
new disc to increase the pool. I followed the description at
https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices
and used 'btrfs device add <device> <btrfs path>'. After that I called a
balance to rebalance the RAID1 using 'btrfs balance start <btrfs path>'.
Is that all, or do I need to call a resize (for example) or anything
else? Or do I need to specify filter/profile parameters for the balance?
I am a little bit confused because the balance command has been running
for 12 hours and only 3 GB of data have been touched. That would mean the
whole balance process (the new disc has 8 TB) will run for a long, long
time... and it is using one CPU at 100%.
Thanks for your help and time.
Bye
	Frank
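
(For reference, the sequence described above corresponds roughly to the
following; the device name and mount point are placeholders, not taken
from the original post:

  # add the new disc to the existing raid1 filesystem
  btrfs device add /dev/sde /mnt/pool
  # redistribute the existing raid1 chunks across all devices, including the new one
  btrfs balance start /mnt/pool
  # from another shell, check how far the balance has gotten
  btrfs balance status /mnt/pool

No separate resize step should be needed; 'btrfs device add' already
makes the whole capacity of the new device available.)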


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Best Practice: Add new device to RAID1 pool
  2017-07-24 11:27 Best Practice: Add new device to RAID1 pool Cloud Admin
@ 2017-07-24 13:46 ` Austin S. Hemmelgarn
  2017-07-24 14:08   ` Roman Mamedov
  2017-07-24 14:12   ` Cloud Admin
  2017-07-24 20:35 ` Best Practice: Add new device to RAID1 pool Chris Murphy
  1 sibling, 2 replies; 21+ messages in thread
From: Austin S. Hemmelgarn @ 2017-07-24 13:46 UTC (permalink / raw)
  To: Cloud Admin, linux-btrfs

On 2017-07-24 07:27, Cloud Admin wrote:
> Hi,
> I have a multi-device pool (three discs) as RAID1. Now I want to add a
> new disc to increase the pool. I followed the description at
> https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices
> and used 'btrfs device add <device> <btrfs path>'. After that I called a
> balance to rebalance the RAID1 using 'btrfs balance start <btrfs path>'.
> Is that all, or do I need to call a resize (for example) or anything
> else? Or do I need to specify filter/profile parameters for the balance?
> I am a little bit confused because the balance command has been running
> for 12 hours and only 3 GB of data have been touched. That would mean the
> whole balance process (the new disc has 8 TB) will run for a long, long
> time... and it is using one CPU at 100%.

Based on what you're saying, it sounds like you've either run into a 
bug, or have a huge number of snapshots on this filesystem.  What you 
described is exactly what you should be doing when expanding an array 
(add the device, then run a full balance).  The fact that it's taking 
this long isn't normal, unless you have very slow storage devices.
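
A couple of quick checks that can narrow this down (the mount point is a
placeholder):

  # how many subvolumes and snapshots the filesystem carries
  btrfs subvolume list /mnt/pool | wc -l
  btrfs subvolume list -s /mnt/pool | wc -l   # snapshots only
  # how far the running balance has actually gotten
  btrfs balance status /mnt/pool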

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Best Practice: Add new device to RAID1 pool
  2017-07-24 13:46 ` Austin S. Hemmelgarn
@ 2017-07-24 14:08   ` Roman Mamedov
  2017-07-24 16:42     ` Cloud Admin
  2017-07-24 14:12   ` Cloud Admin
  1 sibling, 1 reply; 21+ messages in thread
From: Roman Mamedov @ 2017-07-24 14:08 UTC (permalink / raw)
  To: Austin S. Hemmelgarn; +Cc: Cloud Admin, linux-btrfs

On Mon, 24 Jul 2017 09:46:34 -0400
"Austin S. Hemmelgarn" <ahferroin7@gmail.com> wrote:

> > I am a little bit confused because the balance command has been
> > running for 12 hours and only 3 GB of data have been touched. That
> > would mean the whole balance process (the new disc has 8 TB) will run
> > for a long, long time... and it is using one CPU at 100%.
> 
> Based on what you're saying, it sounds like you've either run into a 
> bug, or have a huge number of snapshots

...and possibly quotas (qgroups) enabled (perhaps automatically by some
tool, and not by you). Try:

  btrfs quota disable <mountpoint>
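
To confirm first whether qgroups are active at all, something along these
lines should work (again with a placeholder mount point):

  btrfs qgroup show <mountpoint>

If that errors out because quotas are not enabled, the slowdown is coming
from somewhere else.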

With respect,
Roman

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Best Practice: Add new device to RAID1 pool
  2017-07-24 13:46 ` Austin S. Hemmelgarn
  2017-07-24 14:08   ` Roman Mamedov
@ 2017-07-24 14:12   ` Cloud Admin
  2017-07-24 14:25     ` Austin S. Hemmelgarn
  1 sibling, 1 reply; 21+ messages in thread
From: Cloud Admin @ 2017-07-24 14:12 UTC (permalink / raw)
  To: linux-btrfs

On Monday, 2017-07-24 at 09:46 -0400, Austin S. Hemmelgarn wrote:
> On 2017-07-24 07:27, Cloud Admin wrote:
> > Hi,
> > I have a multi-device pool (three discs) as RAID1. Now I want to add
> > a new disc to increase the pool. I followed the description at
> > https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices
> > and used 'btrfs device add <device> <btrfs path>'. After that I called
> > a balance to rebalance the RAID1 using 'btrfs balance start <btrfs path>'.
> > Is that all, or do I need to call a resize (for example) or anything
> > else? Or do I need to specify filter/profile parameters for the balance?
> > I am a little bit confused because the balance command has been running
> > for 12 hours and only 3 GB of data have been touched. That would mean
> > the whole balance process (the new disc has 8 TB) will run for a long,
> > long time... and it is using one CPU at 100%.
> 
> Based on what you're saying, it sounds like you've either run into a 
> bug, or have a huge number of snapshots on this filesystem.  

It depends on what you define as huge. 'btrfs sub list <btrfs path>'
returns a list of 255 subvolumes.
I think this is not too huge. Most of these subvolumes were created by
Docker itself. I cancelled the balance (this will take a while) and will
try to delete some of these subvolumes/snapshots.
What more can I do?
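
A sketch of that cleanup, with placeholder names:

  # stop the running balance (it finishes the chunk currently being relocated first)
  btrfs balance cancel /mnt/pool
  # list only the snapshots, to see what docker has accumulated
  btrfs subvolume list -s /mnt/pool
  # delete a snapshot that is no longer needed
  btrfs subvolume delete /mnt/pool/path/to/snapshot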

> What you described is exactly what you should be doing when expanding an
> array (add the device, then run a full balance).  The fact that it's
> taking this long isn't normal, unless you have very slow storage devices.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Best Practice: Add new device to RAID1 pool
  2017-07-24 14:12   ` Cloud Admin
@ 2017-07-24 14:25     ` Austin S. Hemmelgarn
  2017-07-24 16:40       ` Cloud Admin
  0 siblings, 1 reply; 21+ messages in thread
From: Austin S. Hemmelgarn @ 2017-07-24 14:25 UTC (permalink / raw)
  To: Cloud Admin, linux-btrfs

On 2017-07-24 10:12, Cloud Admin wrote:
> On Monday, 2017-07-24 at 09:46 -0400, Austin S. Hemmelgarn wrote:
>> On 2017-07-24 07:27, Cloud Admin wrote:
>>> Hi,
>>> I have a multi-device pool (three discs) as RAID1. Now I want to add
>>> a new disc to increase the pool. I followed the description at
>>> https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices
>>> and used 'btrfs device add <device> <btrfs path>'. After that I called
>>> a balance to rebalance the RAID1 using 'btrfs balance start <btrfs path>'.
>>> Is that all, or do I need to call a resize (for example) or anything
>>> else? Or do I need to specify filter/profile parameters for the balance?
>>> I am a little bit confused because the balance command has been running
>>> for 12 hours and only 3 GB of data have been touched. That would mean
>>> the whole balance process (the new disc has 8 TB) will run for a long,
>>> long time... and it is using one CPU at 100%.
>>
>> Based on what you're saying, it sounds like you've either run into a
>> bug, or have a huge number of snapshots on this filesystem.
> 
> It depends on what you define as huge. 'btrfs sub list <btrfs path>'
> returns a list of 255 subvolumes.
OK, this isn't horrible, especially if most of them aren't snapshots 
(it's cross-subvolume reflinks that are most of the issue when it comes 
to snapshots, not the fact that they're subvolumes).
> I think this is not too huge. Most of these subvolumes were created by
> Docker itself. I cancelled the balance (this will take a while) and will
> try to delete some of these subvolumes/snapshots.
> What more can I do?
As Roman mentioned in his reply, it may also be qgroup related.  If you run:

  btrfs quota disable

on the filesystem in question, that may help too, and if you are using
quotas, turning them off with that command will get you a much bigger
performance improvement than removing all the snapshots.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Best Practice: Add new device to RAID1 pool
  2017-07-24 14:25     ` Austin S. Hemmelgarn
@ 2017-07-24 16:40       ` Cloud Admin
  2017-07-29 23:04         ` Best Practice: Add new device to RAID1 pool (Summary) Cloud Admin
  0 siblings, 1 reply; 21+ messages in thread
From: Cloud Admin @ 2017-07-24 16:40 UTC (permalink / raw)
  To: linux-btrfs

On Monday, 2017-07-24 at 10:25 -0400, Austin S. Hemmelgarn wrote:
> On 2017-07-24 10:12, Cloud Admin wrote:
> > On Monday, 2017-07-24 at 09:46 -0400, Austin S. Hemmelgarn wrote:
> > > On 2017-07-24 07:27, Cloud Admin wrote:
> > > > Hi,
> > > > I have a multi-device pool (three discs) as RAID1. Now I want to
> > > > add a new disc to increase the pool. I followed the description at
> > > > https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices
> > > > and used 'btrfs device add <device> <btrfs path>'. After that I
> > > > called a balance to rebalance the RAID1 using 'btrfs balance start
> > > > <btrfs path>'.
> > > > Is that all, or do I need to call a resize (for example) or
> > > > anything else? Or do I need to specify filter/profile parameters
> > > > for the balance?
> > > > I am a little bit confused because the balance command has been
> > > > running for 12 hours and only 3 GB of data have been touched. That
> > > > would mean the whole balance process (the new disc has 8 TB) will
> > > > run for a long, long time... and it is using one CPU at 100%.
> > > 
> > > Based on what you're saying, it sounds like you've either run into a
> > > bug, or have a huge number of snapshots on this filesystem.
> > 
> > It depends on what you define as huge. 'btrfs sub list <btrfs path>'
> > returns a list of 255 subvolumes.
> 
> OK, this isn't horrible, especially if most of them aren't snapshots
> (it's cross-subvolume reflinks that are most of the issue when it comes
> to snapshots, not the fact that they're subvolumes).
> > I think this is not too huge. Most of these subvolumes were created by
> > Docker itself. I cancelled the balance (this will take a while) and
> > will try to delete some of these subvolumes/snapshots.
> > What more can I do?
> 
> As Roman mentioned in his reply, it may also be qgroup related.  If you run:
> 
>   btrfs quota disable
It seems quota was one part of it. Thanks for the tip. I disabled it and
started the balance again.
Now roughly one chunk is relocated every 5 minutes. But if I take the
reported 10860 chunks and calculate the time, it will take ~37 days to
finish... So it seems I have to invest more time in figuring out the
subvolume/snapshot structure created by Docker.
A first deeper look shows there is a subvolume with a snapshot, which
itself has a snapshot, and so forth.
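(As a rough check of that estimate: 10860 chunks x ~5 min per chunk is
about 54300 minutes, roughly 905 hours or 37-38 days, so the figure above
is consistent.)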
> 
> on the filesystem in question, that may help too, and if you are using
> quotas, turning them off with that command will get you a much bigger
> performance improvement than removing all the snapshots.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Best Practice: Add new device to RAID1 pool
  2017-07-24 14:08   ` Roman Mamedov
@ 2017-07-24 16:42     ` Cloud Admin
  0 siblings, 0 replies; 21+ messages in thread
From: Cloud Admin @ 2017-07-24 16:42 UTC (permalink / raw)
  To: linux-btrfs

On Monday, 2017-07-24 at 19:08 +0500, Roman Mamedov wrote:
> On Mon, 24 Jul 2017 09:46:34 -0400
> "Austin S. Hemmelgarn" <ahferroin7@gmail.com> wrote:
> 
> > > I am a little bit confused because the balance command has been
> > > running for 12 hours and only 3 GB of data have been touched. That
> > > would mean the whole balance process (the new disc has 8 TB) will
> > > run for a long, long time... and it is using one CPU at 100%.
> > 
> > Based on what you're saying, it sounds like you've either run into a
> > bug, or have a huge number of snapshots
> 
> ...and possibly quotas (qgroups) enabled (perhaps automatically by some
> tool, and not by you). Try:
> 
>   btrfs quota disable <mountpoint>
> 
It seems this was one part of my problem. See my answer to Austin.
> 
> With respect,
> Roman

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Best Practice: Add new device to RAID1 pool
  2017-07-24 11:27 Best Practice: Add new device to RAID1 pool Cloud Admin
  2017-07-24 13:46 ` Austin S. Hemmelgarn
@ 2017-07-24 20:35 ` Chris Murphy
  2017-07-24 20:42   ` Hugo Mills
  2017-07-24 21:12   ` waxhead
  1 sibling, 2 replies; 21+ messages in thread
From: Chris Murphy @ 2017-07-24 20:35 UTC (permalink / raw)
  To: Cloud Admin; +Cc: Btrfs BTRFS

On Mon, Jul 24, 2017 at 5:27 AM, Cloud Admin <admin@cloud.haefemeier.eu> wrote:

> I am a little bit confused because the balance command has been running
> for 12 hours and only 3 GB of data have been touched.

That's incredibly slow. Something isn't right.

Using btrfs-debug -b from btrfs-progs, I've selected a few 100% full chunks.

[156777.077378] f26s.localdomain sudo[13757]:    chris : TTY=pts/2 ;
PWD=/home/chris ; USER=root ; COMMAND=/sbin/btrfs balance start
-dvrange=157970071552..159043813376 /
[156773.328606] f26s.localdomain kernel: BTRFS info (device sda1):
relocating block group 157970071552 flags data
[156800.408918] f26s.localdomain kernel: BTRFS info (device sda1):
found 38952 extents
[156861.343067] f26s.localdomain kernel: BTRFS info (device sda1):
found 38951 extents

That 1GiB chunk with quite a few fragments took 88s. That's 11MB/s.
Even for a hard drive, that's slow. I've got maybe a dozen snapshots
on this particular volume and quotas are not enabled. By definition
all of those extents are sequential. So I'm not sure why it's taking
so long. Seems almost like a regression somewhere. A nearby chunk with
~23k extents only takes 45s to balance. And another chunk with ~32000
extents took 55s to balance.

4.11.10-300.fc26.x86_64
btrfs-progs-4.11.1-1.fc27.x86_64

But what you are experiencing is orders of magnitude worse than what
I'm experiencing. What kernel and progs are you using?

Track down btrfs-debug in the root of
https://github.com/kdave/btrfs-progs and point it at the mounted
volume with -b, so something like:
sudo btrfs-debug -b /srv/scratch

Also, do you have any output from kernel messages like the above
("relocating block group" and "found ... extents")? How many extents are
in the block groups that have been relocated?
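
One minimal way to pull those messages back out of the logs, assuming the
kernel log still holds them:

  journalctl -k | grep -E 'relocating block group|found [0-9]+ extents'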

I don't know if this is a helpful comparison, but I'm finding 'btrfs
inspect-internal tree-stats' useful.

[chris@f26s ~]$ sudo btrfs inspect tree-stats /dev/sda1
WARNING: /dev/sda1 already mounted, results may be inaccurate
Calculating size of root tree
    Total size: 64.00KiB
        Inline data: 0.00B
    Total seeks: 3
        Forward seeks: 2
        Backward seeks: 1
        Avg seek len: 4.82MiB
    Total clusters: 1
        Avg cluster size: 0.00B
        Min cluster size: 0.00B
        Max cluster size: 16.00KiB
    Total disk spread: 6.50MiB
    Total read time: 0 s 3 us
    Levels: 2
Calculating size of extent tree
    Total size: 63.03MiB
        Inline data: 0.00B
    Total seeks: 3613
        Forward seeks: 1801
        Backward seeks: 1812
        Avg seek len: 15.19GiB
    Seek histogram
              16384 -      147456:         546 ###
             180224 -     5554176:         540 ###
            5718016 -    22200320:         540 ###
           22265856 -    96534528:         540 ###
           96616448 - 47356215296:         540 ###
        47357067264 - 64038076416:         540 ###
        64038371328 - 64525729792:         346 #
    Total clusters: 295
        Avg cluster size: 38.78KiB
        Min cluster size: 32.00KiB
        Max cluster size: 128.00KiB
    Total disk spread: 60.12GiB
    Total read time: 0 s 1338 us
    Levels: 3
Calculating size of csum tree
    Total size: 67.44MiB
        Inline data: 0.00B
    Total seeks: 3368
        Forward seeks: 2167
        Backward seeks: 1201
        Avg seek len: 12.95GiB
    Seek histogram
              16384 -       65536:         532 ###
              98304 -      720896:         504 ###
             753664 -    37404672:         504 ###
           38125568 -   215547904:         504 ###
          216481792 - 47522119680:         504 ###
        47522430976 - 63503482880:         505 ###
        63508348928 - 64503119872:         267 #
    Total clusters: 389
        Avg cluster size: 54.75KiB
        Min cluster size: 32.00KiB
        Max cluster size: 640.00KiB
    Total disk spread: 60.12GiB
    Total read time: 0 s 139678 us
    Levels: 3
Calculating size of fs tree
    Total size: 48.00KiB
        Inline data: 0.00B
    Total seeks: 2
        Forward seeks: 0
        Backward seeks: 2
        Avg seek len: 62.95MiB
    Total clusters: 1
        Avg cluster size: 0.00B
        Min cluster size: 0.00B
        Max cluster size: 16.00KiB
    Total disk spread: 125.86MiB
    Total read time: 0 s 19675 us
    Levels: 2
[chris@f26s ~]$


I don't think the number of snapshots you have for Docker containers
is the problem. There's this thread (admittedly on SSD) which suggests
decent performance is possible with thousands of containers per day
(100,000 - 200,000 per day but I don't think that's per file system,
I'm actually not sure how many file systems are involved).

https://www.spinics.net/lists/linux-btrfs/msg67308.html



-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Best Practice: Add new device to RAID1 pool
  2017-07-24 20:35 ` Best Practice: Add new device to RAID1 pool Chris Murphy
@ 2017-07-24 20:42   ` Hugo Mills
  2017-07-24 20:55     ` Chris Murphy
  2017-07-25 17:56     ` Cloud Admin
  2017-07-24 21:12   ` waxhead
  1 sibling, 2 replies; 21+ messages in thread
From: Hugo Mills @ 2017-07-24 20:42 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Cloud Admin, Btrfs BTRFS


On Mon, Jul 24, 2017 at 02:35:05PM -0600, Chris Murphy wrote:
> On Mon, Jul 24, 2017 at 5:27 AM, Cloud Admin <admin@cloud.haefemeier.eu> wrote:
> 
> > I am a little bit confused because the balance command has been
> > running for 12 hours and only 3 GB of data have been touched.
> 
> That's incredibly slow. Something isn't right.
> 
> Using btrfs-debug -b from btrfs-progs, I've selected a few 100% full chunks.
> 
> [156777.077378] f26s.localdomain sudo[13757]:    chris : TTY=pts/2 ;
> PWD=/home/chris ; USER=root ; COMMAND=/sbin/btrfs balance start
> -dvrange=157970071552..159043813376 /
> [156773.328606] f26s.localdomain kernel: BTRFS info (device sda1):
> relocating block group 157970071552 flags data
> [156800.408918] f26s.localdomain kernel: BTRFS info (device sda1):
> found 38952 extents
> [156861.343067] f26s.localdomain kernel: BTRFS info (device sda1):
> found 38951 extents
> 
> That 1GiB chunk with quite a few fragments took 88s. That's 11MB/s.
> Even for a hard drive, that's slow. I've got maybe a dozen snapshots
> on this particular volume and quotas are not enabled. By definition
> all of those extents are sequential. So I'm not sure why it's taking
> so long. Seems almost like a regression somewhere. A nearby chunk with
> ~23k extents only takes 45s to balance. And another chunk with ~32000
> extents took 55s to balance.

   In my experience, it's pretty consistent at about a minute per 1
GiB for data on rotational drives on RAID-1. For metadata, it can go
up to several hours (or more) per 256 MiB chunk, depending on what
kind of metadata it is. With extents shared between lots of files, it
slows down. In my case, with a few hundred snapshots of the same
thing, my system was taking 4h per chunk for the chunks full of the
extent tree.

   Hugo.

> 4.11.10-300.fc26.x86_64
> btrfs-progs-4.11.1-1.fc27.x86_64
> 
> But what you are experiencing is orders of magnitude worse than what
> I'm experiencing. What kernel and progs are you using?
> 
> Track down btrfs-debug in the root of
> https://github.com/kdave/btrfs-progs and point it at the mounted
> volume with -b, so something like:
> sudo btrfs-debug -b /srv/scratch
> 
> Also, do you have any output from kernel messages like the above
> ("relocating block group" and "found ... extents")? How many extents
> are in the block groups that have been relocated?
> 
> I don't know if this is a helpful comparison, but I'm finding 'btrfs
> inspect-internal tree-stats' useful.
> 
> [chris@f26s ~]$ sudo btrfs inspect tree-stats /dev/sda1
> WARNING: /dev/sda1 already mounted, results may be inaccurate
> Calculating size of root tree
>     Total size: 64.00KiB
>         Inline data: 0.00B
>     Total seeks: 3
>         Forward seeks: 2
>         Backward seeks: 1
>         Avg seek len: 4.82MiB
>     Total clusters: 1
>         Avg cluster size: 0.00B
>         Min cluster size: 0.00B
>         Max cluster size: 16.00KiB
>     Total disk spread: 6.50MiB
>     Total read time: 0 s 3 us
>     Levels: 2
> Calculating size of extent tree
>     Total size: 63.03MiB
>         Inline data: 0.00B
>     Total seeks: 3613
>         Forward seeks: 1801
>         Backward seeks: 1812
>         Avg seek len: 15.19GiB
>     Seek histogram
>               16384 -      147456:         546 ###
>              180224 -     5554176:         540 ###
>             5718016 -    22200320:         540 ###
>            22265856 -    96534528:         540 ###
>            96616448 - 47356215296:         540 ###
>         47357067264 - 64038076416:         540 ###
>         64038371328 - 64525729792:         346 #
>     Total clusters: 295
>         Avg cluster size: 38.78KiB
>         Min cluster size: 32.00KiB
>         Max cluster size: 128.00KiB
>     Total disk spread: 60.12GiB
>     Total read time: 0 s 1338 us
>     Levels: 3
> Calculating size of csum tree
>     Total size: 67.44MiB
>         Inline data: 0.00B
>     Total seeks: 3368
>         Forward seeks: 2167
>         Backward seeks: 1201
>         Avg seek len: 12.95GiB
>     Seek histogram
>               16384 -       65536:         532 ###
>               98304 -      720896:         504 ###
>              753664 -    37404672:         504 ###
>            38125568 -   215547904:         504 ###
>           216481792 - 47522119680:         504 ###
>         47522430976 - 63503482880:         505 ###
>         63508348928 - 64503119872:         267 #
>     Total clusters: 389
>         Avg cluster size: 54.75KiB
>         Min cluster size: 32.00KiB
>         Max cluster size: 640.00KiB
>     Total disk spread: 60.12GiB
>     Total read time: 0 s 139678 us
>     Levels: 3
> Calculating size of fs tree
>     Total size: 48.00KiB
>         Inline data: 0.00B
>     Total seeks: 2
>         Forward seeks: 0
>         Backward seeks: 2
>         Avg seek len: 62.95MiB
>     Total clusters: 1
>         Avg cluster size: 0.00B
>         Min cluster size: 0.00B
>         Max cluster size: 16.00KiB
>     Total disk spread: 125.86MiB
>     Total read time: 0 s 19675 us
>     Levels: 2
> [chris@f26s ~]$
> 
> 
> I don't think the number of snapshots you have for Docker containers
> is the problem. There's this thread (admittedly on SSD) which suggests
> decent performance is possible with thousands of containers per day
> (100,000 - 200,000 per day but I don't think that's per file system,
> I'm actually not sure how many file systems are involved).
> 
> https://www.spinics.net/lists/linux-btrfs/msg67308.html
> 
> 
> 

-- 
Hugo Mills             | Two things came out of Berkeley in the 1960s: LSD
hugo@... carfax.org.uk | and Unix. This is not a coincidence.
http://carfax.org.uk/  |
PGP: E2AB1DE4          |


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Best Practice: Add new device to RAID1 pool
  2017-07-24 20:42   ` Hugo Mills
@ 2017-07-24 20:55     ` Chris Murphy
  2017-07-24 21:00       ` Hugo Mills
  2017-07-24 21:17       ` Adam Borowski
  2017-07-25 17:56     ` Cloud Admin
  1 sibling, 2 replies; 21+ messages in thread
From: Chris Murphy @ 2017-07-24 20:55 UTC (permalink / raw)
  To: Hugo Mills, Chris Murphy, Cloud Admin, Btrfs BTRFS

On Mon, Jul 24, 2017 at 2:42 PM, Hugo Mills <hugo@carfax.org.uk> wrote:

>
>    In my experience, it's pretty consistent at about a minute per 1
> GiB for data on rotational drives on RAID-1. For metadata, it can go
> up to several hours (or more) per 256 MiB chunk, depending on what
> kind of metadata it is. With extents shared between lots of files, it
> slows down. In my case, with a few hundred snapshots of the same
> thing, my system was taking 4h per chunk for the chunks full of the
> extent tree.

Egads.

Maybe Cloud Admin ought to consider using a filter to just balance the
data chunks across the three devices, and just leave the metadata on
the original two disks?

Maybe

sudo btrfs balance start -dusage=100 <mp>


-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Best Practice: Add new device to RAID1 pool
  2017-07-24 20:55     ` Chris Murphy
@ 2017-07-24 21:00       ` Hugo Mills
  2017-07-24 21:17       ` Adam Borowski
  1 sibling, 0 replies; 21+ messages in thread
From: Hugo Mills @ 2017-07-24 21:00 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Cloud Admin, Btrfs BTRFS


On Mon, Jul 24, 2017 at 02:55:00PM -0600, Chris Murphy wrote:
> On Mon, Jul 24, 2017 at 2:42 PM, Hugo Mills <hugo@carfax.org.uk> wrote:
> 
> >
> >    In my experience, it's pretty consistent at about a minute per 1
> > GiB for data on rotational drives on RAID-1. For metadata, it can go
> > up to several hours (or more) per 256 MiB chunk, depending on what
> > kind of metadata it is. With extents shared between lots of files, it
> > slows down. In my case, with a few hundred snapshots of the same
> > thing, my system was taking 4h per chunk for the chunks full of the
> > extent tree.
> 
> Egads.
> 
> Maybe Cloud Admin ought to consider using a filter to just balance the
> data chunks across the three devices, and just leave the metadata on
> the original two disks?
> 
> Maybe
> 
> sudo btrfs balance start -dusage=100 <mp>

   It's certainly a plausible approach, yes.

   Or just wait it out -- the number of slow chunks is typically very
small. Note that most of the metadata will be csums (which are fast),
and not all of the other metadata chunks are slow ones.

   It would be interesting to know in this case the times of the
chunks that have been balanced to date (grep for the lines with the
chunk IDs in system logs).

   Hugo.

-- 
Hugo Mills             | Two things came out of Berkeley in the 1960s: LSD
hugo@... carfax.org.uk | and Unix. This is not a coincidence.
http://carfax.org.uk/  |
PGP: E2AB1DE4          |


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Best Practice: Add new device to RAID1 pool
  2017-07-24 20:35 ` Best Practice: Add new device to RAID1 pool Chris Murphy
  2017-07-24 20:42   ` Hugo Mills
@ 2017-07-24 21:12   ` waxhead
  2017-07-24 21:20     ` Chris Murphy
  2017-07-25 17:46     ` Cloud Admin
  1 sibling, 2 replies; 21+ messages in thread
From: waxhead @ 2017-07-24 21:12 UTC (permalink / raw)
  To: Chris Murphy, Cloud Admin; +Cc: Btrfs BTRFS



Chris Murphy wrote:
> On Mon, Jul 24, 2017 at 5:27 AM, Cloud Admin <admin@cloud.haefemeier.eu> wrote:
>
>> I am a little bit confused because the balance command has been running
>> for 12 hours and only 3 GB of data have been touched.
> That's incredibly slow. Something isn't right.
>
> Using btrfs-debug -b from btrfs-progs, I've selected a few 100% full chunks.
>
> [156777.077378] f26s.localdomain sudo[13757]:    chris : TTY=pts/2 ;
> PWD=/home/chris ; USER=root ; COMMAND=/sbin/btrfs balance start
> -dvrange=157970071552..159043813376 /
> [156773.328606] f26s.localdomain kernel: BTRFS info (device sda1):
> relocating block group 157970071552 flags data
> [156800.408918] f26s.localdomain kernel: BTRFS info (device sda1):
> found 38952 extents
> [156861.343067] f26s.localdomain kernel: BTRFS info (device sda1):
> found 38951 extents
>
> That 1GiB chunk with quite a few fragments took 88s. That's 11MB/s.
> Even for a hard drive, that's slow. I
This may be a stupid question, but is your pool of butter (or BTRFS
pool) by any chance hooked up via USB? If this is USB 2.0 at 480 Mbit/s,
then that is about 57 MB/s / 4 drives = roughly 14.25 MB/s each, or
about 11 MB/s if you shave off some overhead.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Best Practice: Add new device to RAID1 pool
  2017-07-24 20:55     ` Chris Murphy
  2017-07-24 21:00       ` Hugo Mills
@ 2017-07-24 21:17       ` Adam Borowski
  2017-07-24 23:18         ` Chris Murphy
  1 sibling, 1 reply; 21+ messages in thread
From: Adam Borowski @ 2017-07-24 21:17 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Hugo Mills, Cloud Admin, Btrfs BTRFS

On Mon, Jul 24, 2017 at 02:55:00PM -0600, Chris Murphy wrote:
> Egads.
> 
> Maybe Cloud Admin ought to consider using a filter to just balance the
> data chunks across the three devices, and just leave the metadata on
> the original two disks?

Balancing when adding a new disk isn't that important unless the two old
disks are almost full.

> Maybe
> sudo btrfs balance start -dusage=100 <mp>

Note that this doesn't mean "all data", merely "all data chunks < 100%
full".  It's a strictly-lesser-than comparison.

You'd want "-dusage=101" which is illegal, the right one is "-d".  I used to
believe -dusage=100 does that, myself.
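
In concrete terms, a data-only balance that also rewrites the completely
full data chunks would then simply be:

  sudo btrfs balance start -d <mountpoint>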


Meow!
-- 
⢀⣴⠾⠻⢶⣦⠀ 
⣾⠁⢠⠒⠀⣿⡁ A dumb species has no way to open a tuna can.
⢿⡄⠘⠷⠚⠋⠀ A smart species invents a can opener.
⠈⠳⣄⠀⠀⠀⠀ A master species delegates.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Best Practice: Add new device to RAID1 pool
  2017-07-24 21:12   ` waxhead
@ 2017-07-24 21:20     ` Chris Murphy
  2017-07-25  2:22       ` Marat Khalili
  2017-07-25 17:46     ` Cloud Admin
  1 sibling, 1 reply; 21+ messages in thread
From: Chris Murphy @ 2017-07-24 21:20 UTC (permalink / raw)
  To: waxhead; +Cc: Chris Murphy, Cloud Admin, Btrfs BTRFS

On Mon, Jul 24, 2017 at 3:12 PM, waxhead <waxhead@dirtcellar.net> wrote:
>
>
> Chris Murphy wrote:
>>
>> On Mon, Jul 24, 2017 at 5:27 AM, Cloud Admin <admin@cloud.haefemeier.eu>
>> wrote:
>>
>>> I am a little bit confused because the balance command has been
>>> running for 12 hours and only 3 GB of data have been touched.
>>
>> That's incredibly slow. Something isn't right.
>>
>> Using btrfs-debug -b from btrfs-progs, I've selected a few 100% full
>> chunks.
>>
>> [156777.077378] f26s.localdomain sudo[13757]:    chris : TTY=pts/2 ;
>> PWD=/home/chris ; USER=root ; COMMAND=/sbin/btrfs balance start
>> -dvrange=157970071552..159043813376 /
>> [156773.328606] f26s.localdomain kernel: BTRFS info (device sda1):
>> relocating block group 157970071552 flags data
>> [156800.408918] f26s.localdomain kernel: BTRFS info (device sda1):
>> found 38952 extents
>> [156861.343067] f26s.localdomain kernel: BTRFS info (device sda1):
>> found 38951 extents
>>
>> That 1GiB chunk with quite a few fragments took 88s. That's 11MB/s.
>> Even for a hard drive, that's slow. I
>
> This may be a stupid question, but is your pool of butter (or BTRFS
> pool) by any chance hooked up via USB? If this is USB 2.0 at 480 Mbit/s,
> then that is about 57 MB/s / 4 drives = roughly 14.25 MB/s each, or
> about 11 MB/s if you shave off some overhead.
>

Nope, USB 3. Typically on scrubs I get 110MB/s that winds down to
60MB/s as it progresses to the slow parts of the disk.


-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Best Practice: Add new device to RAID1 pool
  2017-07-24 21:17       ` Adam Borowski
@ 2017-07-24 23:18         ` Chris Murphy
  0 siblings, 0 replies; 21+ messages in thread
From: Chris Murphy @ 2017-07-24 23:18 UTC (permalink / raw)
  To: Adam Borowski; +Cc: Chris Murphy, Hugo Mills, Cloud Admin, Btrfs BTRFS

On Mon, Jul 24, 2017 at 3:17 PM, Adam Borowski <kilobyte@angband.pl> wrote:
> On Mon, Jul 24, 2017 at 02:55:00PM -0600, Chris Murphy wrote:
>> Egads.
>>
>> Maybe Cloud Admin ought to consider using a filter to just balance the
>> data chunks across the three devices, and just leave the metadata on
>> the original two disks?
>
> Balancing when adding a new disk isn't that important unless the two old
> disks are almost full.
>
>> Maybe
>> sudo btrfs balance start -dusage=100 <mp>
>
> Note that this doesn't mean "all data", merely "all data chunks < 100%
> full".  It's a strictly-lesser-than comparison.
>
> You'd want "-dusage=101", which is illegal; the right one is plain "-d".
> I used to believe -dusage=100 did that myself.


Yeah, they could even go with 50%, because the balance isn't strictly
necessary anyway if the original discs aren't close to full.

-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Best Practice: Add new device to RAID1 pool
  2017-07-24 21:20     ` Chris Murphy
@ 2017-07-25  2:22       ` Marat Khalili
  2017-07-25  8:13         ` Chris Murphy
  0 siblings, 1 reply; 21+ messages in thread
From: Marat Khalili @ 2017-07-25  2:22 UTC (permalink / raw)
  To: Chris Murphy, waxhead; +Cc: Cloud Admin, Btrfs BTRFS

>> This may be a stupid question, but is your pool of butter (or BTRFS
>> pool) by any chance hooked up via USB? If this is USB 2.0 at 480 Mbit/s,
>> then that is about 57 MB/s / 4 drives = roughly 14.25 MB/s each, or
>> about 11 MB/s if you shave off some overhead.
>
>Nope, USB 3. Typically on scrubs I get 110MB/s that winds down to 
>60MB/s as it progresses to the slow parts of the disk.

It could have degraded to USB2 due to bad connection/loose electrical contacts. You know USB3 needs extra wires, and if it lost some it'd connect (or reconnect) in USB2 mode. I'd check historical kernel messages just in case, and/or unmount and reconnect to be sure.
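
One way to check the negotiated link speed without unplugging anything,
assuming usbutils is installed:

  lsusb -t              # 480M means the link came up as USB 2, 5000M as USB 3
  dmesg | grep -i usb   # look for re-enumeration or speed-change messages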
-- 

With Best Regards,
Marat Khalili

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Best Practice: Add new device to RAID1 pool
  2017-07-25  2:22       ` Marat Khalili
@ 2017-07-25  8:13         ` Chris Murphy
  0 siblings, 0 replies; 21+ messages in thread
From: Chris Murphy @ 2017-07-25  8:13 UTC (permalink / raw)
  To: Marat Khalili; +Cc: Chris Murphy, waxhead, Cloud Admin, Btrfs BTRFS

On Mon, Jul 24, 2017 at 8:22 PM, Marat Khalili <mkh@rqc.ru> wrote:
>>> This may be a stupid question, but is your pool of butter (or BTRFS
>>> pool) by any chance hooked up via USB? If this is USB 2.0 at 480 Mbit/s,
>>> then that is about 57 MB/s / 4 drives = roughly 14.25 MB/s each, or
>>> about 11 MB/s if you shave off some overhead.
>>
>>Nope, USB 3. Typically on scrubs I get 110MB/s that winds down to
>>60MB/s as it progresses to the slow parts of the disk.
>
> It could have degraded to USB2 due to bad connection/loose electrical contacts. You know USB3 needs extra wires, and if it lost some it'd connect (or reconnect) in USB2 mode. I'd check historical kernel messages just in case, and/or unmount and reconnect to be sure.

Actually I'm wrong; the device I was testing for this thread is SATA.
What confused me is that the thread is about raid1 and my Btrfs raid1 is
USB 3. But anyway, all devices scrub at 100+ MB/s, yet balance is much
slower than even half of that.


-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Best Practice: Add new device to RAID1 pool
  2017-07-24 21:12   ` waxhead
  2017-07-24 21:20     ` Chris Murphy
@ 2017-07-25 17:46     ` Cloud Admin
  1 sibling, 0 replies; 21+ messages in thread
From: Cloud Admin @ 2017-07-25 17:46 UTC (permalink / raw)
  To: Btrfs BTRFS

On Monday, 2017-07-24 at 23:12 +0200, waxhead wrote:
> 
> Chris Murphy wrote:
> 
> This may be a stupid question, but is your pool of butter (or BTRFS
> pool) by any chance hooked up via USB? If this is USB 2.0 at 
No, it is a SATA array with (currently) four 8TB discs.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Best Practice: Add new device to RAID1 pool
  2017-07-24 20:42   ` Hugo Mills
  2017-07-24 20:55     ` Chris Murphy
@ 2017-07-25 17:56     ` Cloud Admin
  1 sibling, 0 replies; 21+ messages in thread
From: Cloud Admin @ 2017-07-25 17:56 UTC (permalink / raw)
  To: Btrfs BTRFS

On Monday, 2017-07-24 at 20:42 +0000, Hugo Mills wrote:
> On Mon, Jul 24, 2017 at 02:35:05PM -0600, Chris Murphy wrote:
> > On Mon, Jul 24, 2017 at 5:27 AM, Cloud Admin <admin@cloud.haefemeier.eu> wrote:
> > 
> > > I am a little bit confused because the balance command has been
> > > running for 12 hours and only 3 GB of data have been touched.
> > 
> > That's incredibly slow. Something isn't right.
> > 
> > Using btrfs-debug -b from btrfs-progs, I've selected a few 100%
> > full chunks.
> > 
> > [156777.077378] f26s.localdomain sudo[13757]:    chris : TTY=pts/2
> > ;
> > PWD=/home/chris ; USER=root ; COMMAND=/sbin/btrfs balance start
> > -dvrange=157970071552..159043813376 /
> > [156773.328606] f26s.localdomain kernel: BTRFS info (device sda1):
> > relocating block group 157970071552 flags data
> > [156800.408918] f26s.localdomain kernel: BTRFS info (device sda1):
> > found 38952 extents
> > [156861.343067] f26s.localdomain kernel: BTRFS info (device sda1):
> > found 38951 extents
> > 
> > That 1GiB chunk with quite a few fragments took 88s. That's 11MB/s.
> > Even for a hard drive, that's slow. I've got maybe a dozen
> > snapshots
> > on this particular volume and quotas are not enabled. By definition
> > all of those extents are sequential. So I'm not sure why it's
> > taking
> > so long. Seems almost like a regression somewhere. A nearby chunk
> > with
> > ~23k extents only takes 45s to balance. And another chunk with
> > ~32000
> > extents took 55s to balance.
> 
>    In my experience, it's pretty consistent at about a minute per 1
> GiB for data on rotational drives on RAID-1. For metadata, it can go
> up to several hours (or more) per 256 MiB chunk, depending on what
> kind of metadata it is. With extents shared between lots of files, it
> slows down. In my case, with a few hundred snapshots of the same
> thing, my system was taking 4h per chunk for the chunks full of the
> extent tree.
After disabling quota, the balancing is now working faster. After 27h,
approx. 1.3 TB are done. It took around 4h of rearranging the data on
the old three discs before the process started to use the new one. Since
then it has been processing much faster.

Bye
	Frank

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Best Practice: Add new device to RAID1 pool (Summary)
  2017-07-24 16:40       ` Cloud Admin
@ 2017-07-29 23:04         ` Cloud Admin
  2017-07-31 11:52           ` Austin S. Hemmelgarn
  0 siblings, 1 reply; 21+ messages in thread
From: Cloud Admin @ 2017-07-29 23:04 UTC (permalink / raw)
  To: linux-btrfs

On Monday, 2017-07-24 at 18:40 +0200, Cloud Admin wrote:
> On Monday, 2017-07-24 at 10:25 -0400, Austin S. Hemmelgarn wrote:
> > On 2017-07-24 10:12, Cloud Admin wrote:
> > > On Monday, 2017-07-24 at 09:46 -0400, Austin S. Hemmelgarn wrote:
> > > > On 2017-07-24 07:27, Cloud Admin wrote:
> > > > > Hi,
> > > > > I have a multi-device pool (three discs) as RAID1. Now I want to
> > > > > add a new disc to increase the pool. I followed the description
> > > > > at
> > > > > https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices
> > > > > and used 'btrfs device add <device> <btrfs path>'. After that I
> > > > > called a balance to rebalance the RAID1 using 'btrfs balance
> > > > > start <btrfs path>'.
> > > > > Is that all, or do I need to call a resize (for example) or
> > > > > anything else? Or do I need to specify filter/profile parameters
> > > > > for the balance?
> > > > > I am a little bit confused because the balance command has been
> > > > > running for 12 hours and only 3 GB of data have been touched.
> > > > > That would mean the whole balance process (the new disc has 8 TB)
> > > > > will run for a long, long time... and it is using one CPU at
> > > > > 100%.
> > > > 
> > > > Based on what you're saying, it sounds like you've either run into
> > > > a bug, or have a huge number of snapshots on this filesystem.
> > > 
> > > It depends on what you define as huge. 'btrfs sub list <btrfs path>'
> > > returns a list of 255 subvolumes.
> > 
> > OK, this isn't horrible, especially if most of them aren't snapshots
> > (it's cross-subvolume reflinks that are most of the issue when it
> > comes to snapshots, not the fact that they're subvolumes).
> > > I think this is not too huge. Most of these subvolumes were created
> > > by Docker itself. I cancelled the balance (this will take a while)
> > > and will try to delete some of these subvolumes/snapshots.
> > > What more can I do?
> > 
> > As Roman mentioned in his reply, it may also be qgroup related.  If
> > you run:
> > 
> >   btrfs quota disable
> 
> It seems quota was one part of it. Thanks for the tip. I disabled it and
> started the balance again.
> Now roughly one chunk is relocated every 5 minutes. But if I take the
> reported 10860 chunks and calculate the time, it will take ~37 days to
> finish... So it seems I have to invest more time in figuring out the
> subvolume/snapshot structure created by Docker.
> A first deeper look shows there is a subvolume with a snapshot, which
> itself has a snapshot, and so forth.
Now the balance process has finished after 127h and the new disc is in
the pool... Not as long as expected, but in my opinion long enough. Quota
seems to have been one big driver in my case. What I could see over time
was that at the beginning many extents were relocated while ignoring the
new disc. It could probably be a good idea to rebalance using a filter
(like -dusage=30, for example, as sketched below) before adding the new
disc, to decrease the time.
But that is only theory. I will try to keep it in mind for the next time.
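
A sketch of that order, with placeholder names (untested, just the idea
described above):

  # compact mostly-empty data chunks while the array is still small
  btrfs balance start -dusage=30 /mnt/pool
  # then add the new disc and run the full rebalance
  btrfs device add /dev/sde /mnt/pool
  btrfs balance start /mnt/pool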

Thanks all for your tips, ideas and time!
	Frank


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Best Practice: Add new device to RAID1 pool (Summary)
  2017-07-29 23:04         ` Best Practice: Add new device to RAID1 pool (Summary) Cloud Admin
@ 2017-07-31 11:52           ` Austin S. Hemmelgarn
  0 siblings, 0 replies; 21+ messages in thread
From: Austin S. Hemmelgarn @ 2017-07-31 11:52 UTC (permalink / raw)
  To: Cloud Admin, linux-btrfs

On 2017-07-29 19:04, Cloud Admin wrote:
> On Monday, 2017-07-24 at 18:40 +0200, Cloud Admin wrote:
>> On Monday, 2017-07-24 at 10:25 -0400, Austin S. Hemmelgarn wrote:
>>> On 2017-07-24 10:12, Cloud Admin wrote:
>>>> On Monday, 2017-07-24 at 09:46 -0400, Austin S. Hemmelgarn wrote:
>>>>> On 2017-07-24 07:27, Cloud Admin wrote:
>>>>>> Hi,
>>>>>> I have a multi-device pool (three discs) as RAID1. Now I want to
>>>>>> add a new disc to increase the pool. I followed the description at
>>>>>> https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices
>>>>>> and used 'btrfs device add <device> <btrfs path>'. After that I
>>>>>> called a balance to rebalance the RAID1 using 'btrfs balance start
>>>>>> <btrfs path>'.
>>>>>> Is that all, or do I need to call a resize (for example) or
>>>>>> anything else? Or do I need to specify filter/profile parameters
>>>>>> for the balance?
>>>>>> I am a little bit confused because the balance command has been
>>>>>> running for 12 hours and only 3 GB of data have been touched. That
>>>>>> would mean the whole balance process (the new disc has 8 TB) will
>>>>>> run for a long, long time... and it is using one CPU at 100%.
>>>>>
>>>>> Based on what you're saying, it sounds like you've either run into a
>>>>> bug, or have a huge number of snapshots on this filesystem.
>>>>
>>>> It depends on what you define as huge. 'btrfs sub list <btrfs path>'
>>>> returns a list of 255 subvolumes.
>>>
>>> OK, this isn't horrible, especially if most of them aren't snapshots
>>> (it's cross-subvolume reflinks that are most of the issue when it
>>> comes to snapshots, not the fact that they're subvolumes).
>>>> I think this is not too huge. Most of these subvolumes were created
>>>> by Docker itself. I cancelled the balance (this will take a while)
>>>> and will try to delete some of these subvolumes/snapshots.
>>>> What more can I do?
>>>
>>> As Roman mentioned in his reply, it may also be qgroup related.  If
>>> you run:
>>>
>>>   btrfs quota disable
>>
>> It seems quota was one part of it. Thanks for the tip. I disabled it
>> and started the balance again.
>> Now roughly one chunk is relocated every 5 minutes. But if I take the
>> reported 10860 chunks and calculate the time, it will take ~37 days to
>> finish... So it seems I have to invest more time in figuring out the
>> subvolume/snapshot structure created by Docker.
>> A first deeper look shows there is a subvolume with a snapshot, which
>> itself has a snapshot, and so forth.
> Now the balance process has finished after 127h and the new disc is in
> the pool... Not as long as expected, but in my opinion long enough.
> Quota seems to have been one big driver in my case. What I could see
> over time was that at the beginning many extents were relocated while
> ignoring the new disc. It could probably be a good idea to rebalance
> using a filter (like -dusage=30, for example) before adding the new
> disc, to decrease the time.
> But that is only theory. I will try to keep it in mind for the next time.
FWIW, in my own experience, I've found that this does help, although I 
usually use '-dusage=50 -musage=50'.  The same goes for converting to a 
different profile, as in both cases, balance seems to naively assume 
that there is only one partially filled chunk (optimal behavior differs 
between that and the realistic case).
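
As a concrete form of that, with a placeholder mount point:

  btrfs balance start -dusage=50 -musage=50 /mnt/pool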

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2017-07-31 11:52 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-07-24 11:27 Best Practice: Add new device to RAID1 pool Cloud Admin
2017-07-24 13:46 ` Austin S. Hemmelgarn
2017-07-24 14:08   ` Roman Mamedov
2017-07-24 16:42     ` Cloud Admin
2017-07-24 14:12   ` Cloud Admin
2017-07-24 14:25     ` Austin S. Hemmelgarn
2017-07-24 16:40       ` Cloud Admin
2017-07-29 23:04         ` Best Practice: Add new device to RAID1 pool (Summary) Cloud Admin
2017-07-31 11:52           ` Austin S. Hemmelgarn
2017-07-24 20:35 ` Best Practice: Add new device to RAID1 pool Chris Murphy
2017-07-24 20:42   ` Hugo Mills
2017-07-24 20:55     ` Chris Murphy
2017-07-24 21:00       ` Hugo Mills
2017-07-24 21:17       ` Adam Borowski
2017-07-24 23:18         ` Chris Murphy
2017-07-25 17:56     ` Cloud Admin
2017-07-24 21:12   ` waxhead
2017-07-24 21:20     ` Chris Murphy
2017-07-25  2:22       ` Marat Khalili
2017-07-25  8:13         ` Chris Murphy
2017-07-25 17:46     ` Cloud Admin
