From: Qu Wenruo <quwenruo@cn.fujitsu.com>
To: Dave Hansen <dave@sr71.net>, <linux-btrfs@vger.kernel.org>,
	Chris Mason <clm@fb.com>
Subject: Re: qgroup code slowing down rebalance
Date: Thu, 17 Mar 2016 09:36:15 +0800
Message-ID: <56EA0A0F.8070008@cn.fujitsu.com>
In-Reply-To: <56E9C7BB.7060509@sr71.net>
Dave Hansen wrote on 2016/03/16 13:53 -0700:
> I have a medium-sized multi-device btrfs filesystem (4 disks, 16TB
> total) running under 4.5.0-rc5.  I recently added a disk and needed to
> rebalance.  I started a rebalance operation three days ago.  It was on
> the order of 20% done after those three days. :)
>
> During this rebalance, the disks were pretty lightly used.  I would see
> a small burst of tens of MB/s, then it would go back to no activity for
> a few minutes, small burst, no activity, etc...  During the quiet times
> (for the disk) one processor would be pegged inside the kernel and would
> have virtually no I/O wait time.  Also during this time, the filesystem
> was pretty unbearably slow.  An ls of a small directory would hang for
> minutes.
>
> A perf profile shows 92% of the cpu time is being spent in
> btrfs_find_all_roots(), called under this call path:
>
> 	btrfs_commit_transaction
> 	 -> btrfs_qgroup_prepare_account_extents
> 	   -> btrfs_find_all_roots
>
> So I tried disabling quotas by doing:
>
> 	btrfs quota disable /mnt/foo
>
> which took a few minutes to complete, but once it did, the disks went
> back up to doing ~200MB/s, the kernel time went down to ~20%, and the
> system now has lots of I/O wait time.  It looks to be behaving nicely.
>
> Is this expected?  From my perspective, it makes quotas pretty much
> unusable at least during a rebalance.  I have a full 'perf record'
> profile with call graphs if it would be helpful.
>
>
Thanks for the report.
Balance again. The devil is always in the balance.
To be honest, balance itself is already complicated enough, and it is
not a friendly neighborhood for a lot of other functions.
(Yeah, a lot of dedupe bugs are triggered by balance, and only by
balance, and I couldn't hate it any more.)
The perf record profile would help a lot.
Please upload it if that's OK with you.
Also, the following data would help a lot:
1) btrfs fi df output
    To determine the metadata/data ratio.
    Balancing metadata should be quite slow.
2) btrfs subvolume list output
    To determine how many tree blocks are shared between subvolumes.
    The more tree blocks are shared, the slower the quota routine is.
    Feel free to mask the output to avoid leaking information.
3) perf record for balancing metadata and data separately
    This one is optional; it is just to confirm some assumptions.
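For item 1, the ratio can be read off the "used=" fields of the
btrfs fi df output. A minimal sketch follows, assuming the usual
"Type, profile: total=..., used=..." line layout. The sample text is
made up for illustration (it is not from the reported filesystem), and
the helper names used_bytes() and metadata_ratio() are hypothetical.

```python
import re

# Made-up sample in the usual "btrfs fi df" layout; replace with real output.
SAMPLE = """\
Data, RAID1: total=1.00TiB, used=900.00GiB
System, RAID1: total=32.00MiB, used=160.00KiB
Metadata, RAID1: total=20.00GiB, used=15.00GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
"""

UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}

# Captures the block group type and the "used" value with its unit.
LINE = re.compile(r"^(\w+), \w+: total=[\d.]+\w+, used=([\d.]+)(\w+)$")


def used_bytes(text):
    """Return a dict mapping block group type to used bytes."""
    used = {}
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m:
            kind, value, unit = m.groups()
            # Sum in case a type appears with more than one profile.
            used[kind] = used.get(kind, 0.0) + float(value) * UNITS[unit]
    return used


def metadata_ratio(text):
    """Return used metadata divided by used data."""
    used = used_bytes(text)
    return used["Metadata"] / used["Data"]


if __name__ == "__main__":
    print(f"metadata/data used ratio: {metadata_ratio(SAMPLE):.4f}")
```

A higher ratio means a larger share of the balance is spent relocating
metadata, which is the slow case described below.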
btrfs_find_all_roots() is quite a slow operation, and in its worst case
(tree blocks are shared by a lot of trees) it may be near O(2^n).
In normal operation, it's pretty hard to trigger that many extent
creations/deletions/reference changes.
Even if that happens, it causes much, much more IO, so most of the
time is spent waiting for IO rather than doing quota accounting.
But for balance, especially metadata balance, the IO is small while the
number of extents is very high, so most of the time is consumed by
find_all_roots().
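To give a feeling for the worst case mentioned above, here is a toy
model. It is NOT the kernel's backref walk, only an illustration under
the assumption that every tree block at each level is referenced by two
parents and that the walk follows every reference. The names
build_shared_chain() and count_walks_to_roots() are made up for this
sketch.

```python
# Toy model only: this is not how btrfs_find_all_roots() is implemented.
# A "block" is a (level, index) tuple; the dict maps it to its parents.


def build_shared_chain(levels):
    """Build a parent map where each block is shared by two parents.

    Blocks at level `levels` have no entry in the map and act as roots.
    """
    parents = {}
    for level in range(levels):
        for idx in (0, 1):
            # Both blocks of this level are referenced by both blocks above.
            parents[(level, idx)] = [(level + 1, 0), (level + 1, 1)]
    return parents


def count_walks_to_roots(parents, node):
    """Count the upward paths a naive walk follows from node to any root."""
    ups = parents.get(node, [])
    if not ups:
        return 1  # reached a root
    return sum(count_walks_to_roots(parents, p) for p in ups)


if __name__ == "__main__":
    for n in (1, 2, 4, 8, 16):
        walks = count_walks_to_roots(build_shared_chain(n), (0, 0))
        print(f"levels={n:2d} paths={walks}")
```

In this model the number of paths doubles with every additional level
of sharing, and that cost is paid per extent, which is why a workload
with many small extents hurts the most.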
I can try to make balance bypass the quota routine, but I'm not sure
whether that would open another hole that makes quota go crazy again,
and only much, much more testing can prove it. :(
Thanks,
Qu