From: Hans van Kranenburg <hans@knorrie.org>
To: "Holger Hoffstätte" <holger@applied-asynchrony.com>,
linux-btrfs@vger.kernel.org
Subject: Re: Debugging abysmal write performance with 100% cpu kworker/u16:X+flush-btrfs-2
Date: Sat, 25 Jul 2020 23:03:06 +0200
Message-ID: <b7c807f5-3741-d7ff-be4f-acb7dbbc7bb1@knorrie.org>
In-Reply-To: <d3be20df-2f97-6fa8-7050-7315f7ab27a5@applied-asynchrony.com>
On 7/25/20 9:44 PM, Holger Hoffstätte wrote:
> On 2020-07-25 18:43, Hans van Kranenburg wrote:
>> On 7/25/20 5:37 PM, Holger Hoffstätte wrote:
>>>> [<0>] rq_qos_wait+0xfa/0x170
>>>> [<0>] wbt_wait+0x98/0xe0
>>>> [<0>] __rq_qos_throttle+0x23/0x30
>>
>> I need to cat /proc/<pid>/stack a huge number of times in a loop to
>> catch this sort of output once in a while.
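For the record, the sampling loop is essentially this (the pid and sample
count are placeholders; reading another task's stack needs root):

```shell
# Repeatedly sample a task's kernel stack; most reads come back empty
# (task not currently in the kernel), so only print non-empty samples.
sample_stack() {
    pid=$1 count=$2 i=0
    while [ "$i" -lt "$count" ]; do
        i=$((i + 1))
        s=$(cat "/proc/$pid/stack" 2>/dev/null || true)
        if [ -n "$s" ]; then
            printf -- '--- sample %s ---\n%s\n' "$i" "$s"
        fi
    done
}

# e.g.: sample_stack <kworker-pid> 1000 | grep -B3 wbt_wait
```

Most samples come back empty; the wbt_wait frames only show up now and then.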
>
> Ahh! So that means you are just getting lucky and catching the
> occasional throttle in action. Ok.
>
>>> You can tune the expected latency of device writes via:
>>> /sys/block/sda/queue/wbt_lat_usec.
>>
>> Yes, I have been playing around with it earlier, without any effect on
>> the symptoms.
>>
>> I just did this again, echo 0 > all of the involved block devices. When
>> looking at the events/wbt trace point, I see that wbt activity stops at
>> that moment.
>>
>> No difference in symptoms.
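For completeness, the "echo 0" step per device was roughly this (device
names here are illustrative):

```shell
# Disable block-layer writeback throttling per device: writing 0 turns
# wbt off; a positive value sets the target latency in microseconds.
for dev in sda sdb sdc; do                     # illustrative device names
    f="/sys/block/$dev/queue/wbt_lat_usec"
    if [ -w "$f" ]; then
        echo 0 > "$f" 2>/dev/null || true
    fi
done

# Watching the wbt tracepoints confirms throttling activity stops:
#   echo 1 > /sys/kernel/debug/tracing/events/wbt/enable
#   cat /sys/kernel/debug/tracing/trace_pipe
```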
>
> Ok. Worth a shot... but in retrospect not really, since I just realized
> that I had your entire scenario backwards. Never mind then.
np
>> Dirty buffers were ~ 2G in size. I can modify the numbers to make it
>> bigger or smaller. There's absolutely no change in behavior of the system.
>
> Ok. I also have 2GB max-dirty and 1GB writeback-dirty, which is
> plenty for a 10G pipe.
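For reference, that 2G/1G split corresponds to the absolute-byte dirty
sysctls (the drop-in file path below is hypothetical):

```
# /etc/sysctl.d/99-writeback.conf (hypothetical path)
# Hard-cap dirty pages at 2 GiB; start background writeback at 1 GiB.
# The *_bytes variants override vm.dirty_ratio / vm.dirty_background_ratio.
vm.dirty_bytes = 2147483648
vm.dirty_background_bytes = 1073741824
```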
>
>> It doesn't. It's idle, waiting to finally get some data sent to it.
>
> Yup, looks like it.
>
>> All processing speed is inversely proportional to the cpu usage of this
>> kworker/u16:X+flush-btrfs-2 thread. If it reaches >95% kernel cpu usage,
>> everything slows down. The network is idle, the disks are idle. Incoming
>> rsync speed drops, the speed in which btrfs receive is reading input
>> drops, etc. As soon as kworker/u16:X+flush-btrfs-2 busy cpu usage gets
>> below ~ 95% again, throughput goes up.
>
> Only thing I can think of is that your rsync runs create insane amounts
> of small COW extents that need to be ordered/merged. Multiply by the
> number of processes and you're probably hitting some heavily contended
> code path.

Yes, that's what we're looking at. I see that my other email, sent to
the list some hours ago, hasn't shown up yet. Maybe I should resend with
a link to the attachment instead of carrying the attachment, or I'll
wait until tomorrow.
Hm,

Jul 25 21:06:44 syrinx postfix/smtp[32364]: 155C4609B67CF:
to=<linux-btrfs@vger.kernel.org>,
relay=vger.kernel.org[2620:137:e000::1:18]:25, delay=2.2,
delays=0.15/0.02/1.2/0.87, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as
7B1D9C08C5C0)
Jul 25 21:06:44 syrinx postfix/qmgr[951]: 155C4609B67CF: removed

O_o, so it was accepted, but I don't see it.
> Since the kworker isn't stuck forever but apparently makes progress,
> it's not dead, just slow/overloaded.
>
> A few years ago I started using rsync exclusively with --whole-file,
> since it's not just much faster but also creates significantly less
> fragmentation at the expense of some disk space, but whatever... in my
> case snapshot rotation takes care of that. Maybe it's an option for you.
>
> So maybe things to try:
>
> - run rsync with --whole-file
> - run fewer rsyncs in parallel (might not be necessary with --whole-file!)
>
> If that doesn't help someone else needs to chime in..

That someone else could be me, if only I had the work time available to
do it; for the past few months my hobby time has rather gone into
optimizing the vegetable garden and cutting down trees for firewood with
the neighbours.

And yes, all rsyncs have been running with --whole-file since 2014, to
reduce the total amount of reflink garbage in this now ~100T pile of
stuff with >200k subvols.

Doing fewer things in parallel does help, indeed. If I abort everything
and just run 1 of the 20 at a time, it makes progress. But I want to get
this over with before the year 2050.
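If I script a middle ground between 1 and 20, it will be something like
this (subvolume names and the per-job command are placeholders):

```shell
# Run at most 4 transfer jobs at a time instead of all 20 at once;
# xargs -P acts as a simple process pool over the list of subvolumes.
printf '%s\n' subvol-a subvol-b subvol-c subvol-d subvol-e |
    xargs -P4 -I{} sh -c 'echo "syncing {}"'
# the real per-job command would be something like:
#   rsync -a --whole-file ... && btrfs receive ...
```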
K
Thread overview (17 messages):
2020-07-25 14:24 Debugging abysmal write performance with 100% cpu kworker/u16:X+flush-btrfs-2 Hans van Kranenburg
2020-07-25 15:37 ` Holger Hoffstätte
2020-07-25 16:43 ` Hans van Kranenburg
2020-07-25 19:44 ` Holger Hoffstätte
2020-07-25 21:03 ` Hans van Kranenburg [this message]
2020-07-26 1:00 ` Chris Murphy
2020-07-25 21:27 ` Hans van Kranenburg
2020-07-26 8:10 ` A L
2020-07-26 0:50 ` Chris Murphy
2020-07-27 11:09 ` Qu Wenruo
2020-07-27 17:17 ` Hans van Kranenburg
2020-07-27 19:23 ` Chris Murphy
2020-07-27 23:16 ` Chris Murphy
2020-07-28 0:51 ` Qu Wenruo
2020-07-28 1:52 ` Qu Wenruo
2020-07-28 14:52 ` Hans van Kranenburg
2020-07-29 0:15 ` Qu Wenruo