linux-btrfs.vger.kernel.org archive mirror
* qgroup code slowing down rebalance
@ 2016-03-16 20:53 Dave Hansen
  2016-03-17  1:36 ` Qu Wenruo
  0 siblings, 1 reply; 6+ messages in thread
From: Dave Hansen @ 2016-03-16 20:53 UTC (permalink / raw)
  To: linux-btrfs, Qu Wenruo, Chris Mason

I have a medium-sized multi-device btrfs filesystem (4 disks, 16TB
total) running under 4.5.0-rc5.  I recently added a disk and needed to
rebalance.  I started a rebalance operation three days ago.  It was on
the order of 20% done after those three days. :)

During this rebalance, the disks were pretty lightly used.  I would see
a small burst of tens of MB/s, then it would go back to no activity for
a few minutes, small burst, no activity, etc...  During the quiet times
(for the disk) one processor would be pegged inside the kernel and would
have virtually no I/O wait time.  Also during this time, the filesystem
was pretty unbearably slow.  An ls of a small directory would hang for
minutes.

A perf profile shows 92% of the CPU time being spent in
btrfs_find_all_roots(), called via this call path:

	btrfs_commit_transaction
	 -> btrfs_qgroup_prepare_account_extents
	   -> btrfs_find_all_roots

So I tried disabling quotas by doing:

	btrfs quota disable /mnt/foo

which took a few minutes to complete, but once it did, the disks went
back up to doing ~200MB/s, the kernel time went down to ~20%, and the
system now has lots of I/O wait time.  It looks to be behaving nicely.

Is this expected?  From my perspective, it makes quotas pretty much
unusable at least during a rebalance.  I have a full 'perf record'
profile with call graphs if it would be helpful.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: qgroup code slowing down rebalance
  2016-03-16 20:53 qgroup code slowing down rebalance Dave Hansen
@ 2016-03-17  1:36 ` Qu Wenruo
  2016-03-17 16:36   ` Dave Hansen
  0 siblings, 1 reply; 6+ messages in thread
From: Qu Wenruo @ 2016-03-17  1:36 UTC (permalink / raw)
  To: Dave Hansen, linux-btrfs, Chris Mason



Dave Hansen wrote on 2016/03/16 13:53 -0700:
> I have a medium-sized multi-device btrfs filesystem (4 disks, 16TB
> total) running under 4.5.0-rc5.  I recently added a disk and needed to
> rebalance.  I started a rebalance operation three days ago.  It was on
> the order of 20% done after those three days. :)
>
> During this rebalance, the disks were pretty lightly used.  I would see
> a small burst of tens of MB/s, then it would go back to no activity for
> a few minutes, small burst, no activity, etc...  During the quiet times
> (for the disk) one processor would be pegged inside the kernel and would
> have virtually no I/O wait time.  Also during this time, the filesystem
> was pretty unbearably slow.  An ls of a small directory would hang for
> minutes.
>
> A perf profile shows 92% of the cpu time is being spent in
> btrfs_find_all_roots(), called under this call path:
>
> 	btrfs_commit_transaction
> 	 -> btrfs_qgroup_prepare_account_extents
> 	   -> btrfs_find_all_roots
>
> So I tried disabling quotas by doing:
>
> 	btrfs quota disable /mnt/foo
>
> which took a few minutes to complete, but once it did, the disks went
> back up to doing ~200MB/s, the kernel time went down to ~20%, and the
> system now has lots of I/O wait time.  It looks to be behaving nicely.
>
> Is this expected?  From my perspective, it makes quotas pretty much
> unusable at least during a rebalance.  I have a full 'perf record'
> profile with call graphs if it would be helpful.
>
>
Thanks for the report.

Balance again. The devil is, as always, in balance.
To be honest, balance itself is already complicated enough, and it is not 
a friendly neighborhood for a lot of other functions.
(Yeah, a lot of dedupe bugs are, and can only be, triggered by balance, 
and I can't hate it more.)


A perf record profile would help a lot.
Please upload it if that's OK with you.

Also, the following data would help a lot:
1) 'btrfs fi df' output
    To determine the metadata/data ratio.
    Balancing metadata is expected to be quite slow.

2) 'btrfs subvolume list' output
    To determine how many tree blocks are shared with each other.
    The more tree blocks are shared, the slower the quota routines run.

    Feel free to mask the output to avoid leaking information.

3) perf record for balancing metadata and data respectively
    This is optional; it is just to confirm some assumptions.

btrfs_find_all_roots() is quite a slow operation, and in the worst case
(tree blocks shared by a lot of trees) it can approach O(2^n).

For normal operations, it's pretty hard to trigger that many extent
creations/deletions/references.
Even when such a case happens, it causes far more IO, so most of the
time is spent waiting on IO rather than doing quota accounting.

But for balance, and especially for metadata balance, the IO is small
while the number of touched extents is very high, so most of the time is
consumed by find_all_roots().
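To make the shape of that worst case concrete, here is a small toy model
(pure Python, a hypothetical sharing structure, not the kernel's actual
backref-walking code): when a block is referenced twice at every level of
an n-level hierarchy, a naive upward walk traverses every path and does
O(2^n) work, even though the set of distinct roots stays tiny.

```python
# Toy model of exponential backref walking (hypothetical, not kernel code).
# parent_refs maps a block to the blocks referencing it; two references
# per level emulate heavy sharing between snapshots.

def build_shared_chain(levels):
    parent_refs = {i: [i + 1, i + 1] for i in range(levels)}
    parent_refs[levels] = []      # the top block is a root
    return parent_refs

def naive_root_paths(parent_refs, block):
    """Count root-reaching paths a naive walk would traverse."""
    refs = parent_refs[block]
    if not refs:
        return 1
    return sum(naive_root_paths(parent_refs, b) for b in refs)

def distinct_roots(parent_refs, block, seen=None):
    """Visit each block once (memoized) and collect the distinct roots."""
    seen = set() if seen is None else seen
    if block in seen:
        return set()
    seen.add(block)
    refs = parent_refs[block]
    if not refs:
        return {block}
    roots = set()
    for b in refs:
        roots |= distinct_roots(parent_refs, b, seen)
    return roots

chain = build_shared_chain(20)
print(naive_root_paths(chain, 0))   # 1048576 paths (2**20)
print(distinct_roots(chain, 0))     # {20}: only one actual root
```

The gap between the two numbers is the point: the answer ("which roots
reference this extent?") is small, but a walk that re-visits shared
blocks once per path pays exponentially for the sharing.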


I can try to make balance bypass the quota routines, but I'm not sure 
whether that would open another hole and make quota go crazy again, and 
only much, much more testing can prove it. :(

Thanks,
Qu




* Re: qgroup code slowing down rebalance
  2016-03-17  1:36 ` Qu Wenruo
@ 2016-03-17 16:36   ` Dave Hansen
  2016-03-18  1:02     ` Qu Wenruo
  0 siblings, 1 reply; 6+ messages in thread
From: Dave Hansen @ 2016-03-17 16:36 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs, Chris Mason

On 03/16/2016 06:36 PM, Qu Wenruo wrote:
> Dave Hansen wrote on 2016/03/16 13:53 -0700:
>> I have a medium-sized multi-device btrfs filesystem (4 disks, 16TB
>> total) running under 4.5.0-rc5.  I recently added a disk and needed to
>> rebalance.  I started a rebalance operation three days ago.  It was on
>> the order of 20% done after those three days. :)
...
> Perf record profile will help a lot.
> Please upload it if it's OK for you.

I'll send it privately.

But, I do see basically the same behavior when balancing data and
metadata.  I profiled both.

> Also, the following data would help a lot:
> 1) btrfs fi df output
>    To determine the metadata/data ratio
>    Balancing metadata should be quite slow.

Data, RAID1: total=4.53TiB, used=4.53TiB
System, RAID1: total=32.00MiB, used=720.00KiB
Metadata, RAID1: total=17.00GiB, used=15.77GiB
GlobalReserve, single: total=512.00MiB, used=0.00B


> 2) btrfs subvolume list output
>    To determine how many tree blocks are shared against each other
>    More shared tree blocks, slower quota routing is.
> 
>    Feel free to mask the output to avoid information leaking.

Here you go (pasted at the end).  You can pretty clearly see that I'm
using this volume for incremental backups.  My plan was to keep the old
snapshots around until the filesystem fills up.

> 3) perf record for balancing metadata and data respectively
>    Although this is optional. Just to prove some assumption.
> 
> btrfs_find_all_roots() is quite a slow operation, and in its worst case
> (tree blocks are shared by a lot of trees) it may be near O(2^n).

Yikes! But, that does look to be consistent with what I'm seeing.

>> ID 15183 gen 207620 top level 5 path blackbird_backups.enc
>> ID 15547 gen 207620 top level 5 path snapshots/blackbird_backups.enc/1455779116
>> ID 15548 gen 207620 top level 5 path snapshots/blackbird_backups.enc/1455779185
>> ID 15559 gen 207620 top level 5 path snapshots/blackbird_backups.enc/1455814382
>> ID 15560 gen 207620 top level 5 path snapshots/blackbird_backups.enc/1455814540
>> ID 15561 gen 207620 top level 5 path snapshots/blackbird_backups.enc/1455814739
>> ID 15562 gen 207620 top level 5 path snapshots/blackbird_backups.enc/1455814783
>> ID 15563 gen 207620 top level 5 path snapshots/blackbird_backups.enc/1455815180
>> ID 15564 gen 207620 top level 5 path snapshots/blackbird_backups.enc/1455815259
>> ID 15565 gen 207620 top level 5 path snapshots/blackbird_backups.enc/1455815314
>> ID 15566 gen 207620 top level 5 path snapshots/blackbird_backups.enc/1455815948
>> ID 15567 gen 207620 top level 5 path snapshots/blackbird_backups.enc/1455816104
>> ID 15568 gen 207620 top level 5 path snapshots/blackbird_backups.enc/1455816127
>> ID 15569 gen 207620 top level 5 path snapshots/blackbird_backups.enc/1455816213
>> ID 15570 gen 207620 top level 5 path snapshots/blackbird_backups.enc/1455816305
>> ID 15571 gen 207620 top level 5 path snapshots/blackbird_backups.enc/1455816591
>> ID 15572 gen 207620 top level 5 path snapshots/blackbird_backups.enc/1455816622
>> ID 15573 gen 207620 top level 5 path snapshots/blackbird_backups.enc/1455816632
>> ID 15574 gen 207620 top level 5 path snapshots/blackbird_backups.enc/1455816638
>> ID 15575 gen 207620 top level 5 path snapshots/blackbird_backups.enc/1455816837
>> ID 15576 gen 207620 top level 5 path home-backup-on-btrfs.enc
>> ID 16634 gen 207620 top level 5 path homes-backup-nimitz-o2.enc
>> ID 17389 gen 207620 top level 5 path snapshots/root/1456271568
>> ID 17390 gen 207620 top level 5 path snapshots/root/1456272221
>> ID 17391 gen 207620 top level 5 path snapshots/root/1456416369
>> ID 17392 gen 207620 top level 5 path snapshots/root/1456435609
>> ID 17393 gen 207620 top level 5 path snapshots/root/1456539873
>> ID 17394 gen 207620 top level 5 path snapshots/blackbird_backups.enc/1456553298
>> ID 17395 gen 207620 top level 5 path snapshots/root/1456553448
>> ID 17396 gen 207620 top level 5 path snapshots/home-backup-on-btrfs.enc/1456553460
>> ID 17397 gen 207620 top level 5 path snapshots/home-backup-on-btrfs.enc/1456553461
>> ID 17398 gen 207620 top level 5 path snapshots/home-backup-on-btrfs.enc/1456602362
>> ID 17399 gen 207620 top level 5 path snapshots/home-backup-on-btrfs.enc/1456602405
>> ID 17400 gen 207620 top level 5 path snapshots/blackbird_backups.enc/1456602405
>> ID 17401 gen 207620 top level 5 path snapshots/home-backup-on-btrfs.enc/1456628682
>> ID 17402 gen 207620 top level 5 path snapshots/blackbird_backups.enc/1456628682
>> ID 17403 gen 207620 top level 5 path snapshots/root/1456628682
>> ID 17404 gen 207620 top level 5 path snapshots/home-backup-on-btrfs.enc/1456675846
>> ID 17405 gen 207620 top level 5 path snapshots/blackbird_backups.enc/1456675846
>> ID 17406 gen 207620 top level 5 path snapshots/root/1456675846
>> ID 17407 gen 207620 top level 5 path snapshots/home-backup-on-btrfs.enc/1456713915
>> ID 17408 gen 207620 top level 5 path snapshots/home-backup-on-btrfs.enc/1456720748
>> ID 17409 gen 207620 top level 5 path snapshots/home-backup-on-btrfs.enc/1456722793
>> ID 17410 gen 207620 top level 5 path snapshots/home-backup-on-btrfs.enc/1456722812
>> ID 17411 gen 207620 top level 5 path snapshots/root/1456722812
>> ID 17412 gen 207620 top level 5 path snapshots/blackbird_backups.enc/1456722812
>> ID 17413 gen 207620 top level 5 path snapshots/home-backup-on-btrfs.enc/1456731271
>> ID 17414 gen 207620 top level 5 path snapshots/root/1456731271
>> ID 17415 gen 207620 top level 5 path snapshots/blackbird_backups.enc/1456731271
>> ID 17416 gen 207620 top level 5 path snapshots/home-backup-on-btrfs.enc/1456838946
>> ID 17417 gen 207620 top level 5 path snapshots/root/1456838946
>> ID 17418 gen 207620 top level 5 path snapshots/blackbird_backups.enc/1456838946
>> ID 17419 gen 207620 top level 5 path snapshots/root/1457127404
>> ID 17420 gen 207620 top level 5 path snapshots/blackbird_backups.enc/1457127404
>> ID 17421 gen 207620 top level 5 path snapshots/home-backup-on-btrfs.enc/1457127404
>> ID 17422 gen 207620 top level 5 path snapshots/root/1457138948
>> ID 17423 gen 207620 top level 5 path snapshots/blackbird_backups.enc/1457138948
>> ID 17425 gen 207620 top level 5 path snapshots/blackbird_backups.enc/1457451483
>> ID 17426 gen 207620 top level 5 path snapshots/blackbird_backups.enc/1457808380





* Re: qgroup code slowing down rebalance
  2016-03-17 16:36   ` Dave Hansen
@ 2016-03-18  1:02     ` Qu Wenruo
  2016-03-18 16:33       ` Dave Hansen
  0 siblings, 1 reply; 6+ messages in thread
From: Qu Wenruo @ 2016-03-18  1:02 UTC (permalink / raw)
  To: Dave Hansen, linux-btrfs, Chris Mason



Dave Hansen wrote on 2016/03/17 09:36 -0700:
> On 03/16/2016 06:36 PM, Qu Wenruo wrote:
>> Dave Hansen wrote on 2016/03/16 13:53 -0700:
>>> I have a medium-sized multi-device btrfs filesystem (4 disks, 16TB
>>> total) running under 4.5.0-rc5.  I recently added a disk and needed to
>>> rebalance.  I started a rebalance operation three days ago.  It was on
>>> the order of 20% done after those three days. :)
> ...
>> Perf record profile will help a lot.
>> Please upload it if it's OK for you.
>
> I'll send it privately.
>
> But, I do see basically the same behavior when balancing data and
> metadata.  I profiled both.
>
>> Also, the following data would help a lot:
>> 1) btrfs fi df output
>>     To determine the metadata/data ratio
>>     Balancing metadata should be quite slow.
>
> Data, RAID1: total=4.53TiB, used=4.53TiB
> System, RAID1: total=32.00MiB, used=720.00KiB
> Metadata, RAID1: total=17.00GiB, used=15.77GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B

Considering the size and the amount of metadata, even doing a quota 
rescan will be quite slow.

Would you please try to do a quota rescan and see the CPU/IO usage?
>
>
>> 2) btrfs subvolume list output
>>     To determine how many tree blocks are shared against each other
>>     More shared tree blocks, slower quota routing is.
>>
>>     Feel free to mask the output to avoid information leaking.
>
> Here you go (pasted at the end).  You can pretty clearly see that I'm
> using this volume for incremental backups.  My plan was to keep the old
> snapshots around until the filesystem fills up.

Ah, the find_all_roots() nightmare. Just as mentioned, it will slow down 
a quota rescan too.

>
>> 3) perf record for balancing metadata and data respectively
>>     Although this is optional. Just to prove some assumption.
>>
>> btrfs_find_all_roots() is quite a slow operation, and in its worst case
>> (tree blocks are shared by a lot of trees) it may be near O(2^n).
>
> Yikes! But, that does look to be consistent with what I'm seeing.

So, you mean the time consumed by find_all_roots() is consistent?
Then it must be quite slow for almost all tree blocks...

Anyway, I'd better try to allow balance to bypass quota accounting.

Thanks,
Qu
>
>>> [snip: subvolume list, quoted in full earlier in the thread]




* Re: qgroup code slowing down rebalance
  2016-03-18  1:02     ` Qu Wenruo
@ 2016-03-18 16:33       ` Dave Hansen
  2016-03-21  1:44         ` Qu Wenruo
  0 siblings, 1 reply; 6+ messages in thread
From: Dave Hansen @ 2016-03-18 16:33 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs, Chris Mason

On 03/17/2016 06:02 PM, Qu Wenruo wrote:
> Dave Hansen wrote on 2016/03/17 09:36 -0700:
>> On 03/16/2016 06:36 PM, Qu Wenruo wrote:
>>> Dave Hansen wrote on 2016/03/16 13:53 -0700:
>>>> I have a medium-sized multi-device btrfs filesystem (4 disks, 16TB
>>>> total) running under 4.5.0-rc5.  I recently added a disk and needed to
>>>> rebalance.  I started a rebalance operation three days ago.  It was on
>>>> the order of 20% done after those three days. :)
>> ...
>> Data, RAID1: total=4.53TiB, used=4.53TiB
>> System, RAID1: total=32.00MiB, used=720.00KiB
>> Metadata, RAID1: total=17.00GiB, used=15.77GiB
>> GlobalReserve, single: total=512.00MiB, used=0.00B
> 
> Considering the size and the amount of metadata, even doing a quota
> rescan will be quite slow.
> 
> Would you please try to do a quota rescan and see the CPU/IO usage?

I did a quota rescan.  It uses about 80% of one CPU core, but also has
some I/O wait time and pulls 1-20MB/s of data off the disk (the balance
with quotas on was completely CPU-bound, and had very low I/O rates).

It would seem that the "quota rescan" *does* have the same issue as the
balance with quotas on, but to a much smaller extent than what I saw
with the "balance" operation.

I have a full profile recorded from the "quota rescan", but the most
relevant parts are pasted below.  Basically btrfs_search_slot() and
radix tree lookups are eating all the CPU time, but they're still doing
enough I/O to see _some_ idle time on the processor.

>     74.55%     3.10%  kworker/u8:0     [btrfs]                      [k] find_parent_nodes                       
>                |
>                ---find_parent_nodes
>                   |          
>                   |--99.95%-- __btrfs_find_all_roots
>                   |          btrfs_find_all_roots
>                   |          btrfs_qgroup_rescan_worker
>                   |          normal_work_helper
>                   |          btrfs_qgroup_rescan_helper
>                   |          process_one_work
>                   |          worker_thread
>                   |          kthread
>                   |          ret_from_fork
>                    --0.05%-- [...]
> 
>     32.14%     4.16%  kworker/u8:0     [btrfs]                      [k] btrfs_search_slot                       
>                |
>                ---btrfs_search_slot
>                   |          
>                   |--87.90%-- find_parent_nodes
>                   |          __btrfs_find_all_roots
>                   |          btrfs_find_all_roots
>                   |          btrfs_qgroup_rescan_worker
>                   |          normal_work_helper
>                   |          btrfs_qgroup_rescan_helper
>                   |          process_one_work
>                   |          worker_thread
>                   |          kthread
>                   |          ret_from_fork
>                   |          
>                   |--11.70%-- btrfs_search_old_slot
>                   |          __resolve_indirect_refs
>                   |          find_parent_nodes
>                   |          __btrfs_find_all_roots
>                   |          btrfs_find_all_roots
>                   |          btrfs_qgroup_rescan_worker
>                   |          normal_work_helper
>                   |          btrfs_qgroup_rescan_helper
>                   |          process_one_work
>                   |          worker_thread
>                   |          kthread
>                   |          ret_from_fork
>                    --0.39%-- [...]
> 





* Re: qgroup code slowing down rebalance
  2016-03-18 16:33       ` Dave Hansen
@ 2016-03-21  1:44         ` Qu Wenruo
  0 siblings, 0 replies; 6+ messages in thread
From: Qu Wenruo @ 2016-03-21  1:44 UTC (permalink / raw)
  To: Dave Hansen, linux-btrfs, Chris Mason



Dave Hansen wrote on 2016/03/18 09:33 -0700:
> On 03/17/2016 06:02 PM, Qu Wenruo wrote:
>> Dave Hansen wrote on 2016/03/17 09:36 -0700:
>>> On 03/16/2016 06:36 PM, Qu Wenruo wrote:
>>>> Dave Hansen wrote on 2016/03/16 13:53 -0700:
>>>>> I have a medium-sized multi-device btrfs filesystem (4 disks, 16TB
>>>>> total) running under 4.5.0-rc5.  I recently added a disk and needed to
>>>>> rebalance.  I started a rebalance operation three days ago.  It was on
>>>>> the order of 20% done after those three days. :)
>>> ...
>>> Data, RAID1: total=4.53TiB, used=4.53TiB
>>> System, RAID1: total=32.00MiB, used=720.00KiB
>>> Metadata, RAID1: total=17.00GiB, used=15.77GiB
>>> GlobalReserve, single: total=512.00MiB, used=0.00B
>>
>> Considering the size and the amount of metadata, even doing a quota
>> rescan will be quite slow.
>>
>> Would you please try to do a quota rescan and see the CPU/IO usage?
>
> I did a quota rescan.  It uses about 80% of one CPU core, but also has
> some I/O wait time and pulls 1-20MB/s of data off the disk (the balance
> with quotas on was completely CPU-bound, and had very low I/O rates).
>
> It would seem that the "quota rescan" *does* have the same issue as the
> balance with quotas on, but to a much smaller extent than what I saw
> with the "balance" operation.

This is quite expected. Most of the CPU would be consumed by 
find_all_roots(), as your subvolume layout is just about the worst case 
for it.

That is to say, removing unneeded snapshots and keeping them to a limited 
number would be the best approach.
Btrfs snapshots are super fast to create, but the design implies a lot of 
overhead for some minor operations, especially backref lookup.
And unfortunately, quota relies heavily on backref lookup.
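As a rough sketch of why the snapshot count itself matters (toy
arithmetic under an idealized full-sharing assumption, not kernel
internals): every retained snapshot is one more root that a backref
lookup for a shared extent has to find and report, so the accounting
work per extent grows with the number of snapshots kept.

```python
# Toy cost model (hypothetical): an extent fully shared by the source
# subvolume and all of its snapshots is referenced by 1 + n roots.

def roots_per_shared_extent(n_snapshots):
    return 1 + n_snapshots        # source subvolume + each snapshot

def total_root_lookups(n_extents, n_snapshots):
    # One find-all-roots style resolution per extent touched by balance.
    return n_extents * roots_per_shared_extent(n_snapshots)

# Keeping ~50 snapshots versus ~5 makes every fully shared extent
# roughly 8x more expensive to account:
print(total_root_lookups(1_000_000, 50) // total_root_lookups(1_000_000, 5))
```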


The only difference I didn't expect is the IO, as I expected rescan to 
behave much the same as balance, with little IO.
Did you run the rescan/balance after dropping all the caches?


For balance, I can add some patches to make it bypass quota accounting, 
since that seems to be safe.
But for the rescan case, AFAIK the cost is inherent.


BTW, although it's quite risky, I hope you can run some old kernels and 
compare the performance of rescan and balance there.
By old kernels, I mean 4.1, which is the latest kernel that doesn't use 
the new quota framework.

I'm very interested to see whether the old (but incorrect) code has 
better or worse performance for balance and rescan.

Thank you very much for all these quota-related reports.
Qu


> [snip: perf profile, quoted in full earlier in the thread]




end of thread, other threads:[~2016-03-21  1:44 UTC | newest]

Thread overview: 6+ messages
-- links below jump to the message on this page --
2016-03-16 20:53 qgroup code slowing down rebalance Dave Hansen
2016-03-17  1:36 ` Qu Wenruo
2016-03-17 16:36   ` Dave Hansen
2016-03-18  1:02     ` Qu Wenruo
2016-03-18 16:33       ` Dave Hansen
2016-03-21  1:44         ` Qu Wenruo
