From: Hans van Kranenburg <hans@knorrie.org>
To: "Holger Hoffstätte" <holger@applied-asynchrony.com>,
linux-btrfs@vger.kernel.org
Subject: Re: Debugging abysmal write performance with 100% cpu kworker/u16:X+flush-btrfs-2
Date: Sat, 25 Jul 2020 23:03:06 +0200
Message-ID: <b7c807f5-3741-d7ff-be4f-acb7dbbc7bb1@knorrie.org>
In-Reply-To: <d3be20df-2f97-6fa8-7050-7315f7ab27a5@applied-asynchrony.com>
On 7/25/20 9:44 PM, Holger Hoffstätte wrote:
> On 2020-07-25 18:43, Hans van Kranenburg wrote:
>> On 7/25/20 5:37 PM, Holger Hoffstätte wrote:
>>>> [<0>] rq_qos_wait+0xfa/0x170
>>>> [<0>] wbt_wait+0x98/0xe0
>>>> [<0>] __rq_qos_throttle+0x23/0x30
>>
>> I need to cat /proc/<pid>/stack a huge number of times in a loop to
>> catch this sort of output once in a while.
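For the record, the sampling loop is essentially this (the pid and sample
count are placeholders; reading another task's stack needs root):

```shell
# Repeatedly sample a task's kernel stack; most reads come back empty
# (task not currently in the kernel), so only print non-empty samples.
sample_stack() {
    pid=$1 count=$2 i=0
    while [ "$i" -lt "$count" ]; do
        i=$((i + 1))
        s=$(cat "/proc/$pid/stack" 2>/dev/null || true)
        if [ -n "$s" ]; then
            printf -- '--- sample %s ---\n%s\n' "$i" "$s"
        fi
    done
}

# e.g.: sample_stack <kworker-pid> 1000 | grep -B3 wbt_wait
```

Most samples come back empty; the wbt_wait frames only show up now and then.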
>
> Ahh! So that means you are just getting lucky and catching the
> occasional throttle in action. Ok.
>
>>> You can tune the expected latency of device writes via:
>>> /sys/block/sda/queue/wbt_lat_usec.
>>
>> Yes, I have been playing around with it earlier, without any effect on
>> the symptoms.
>>
>> I just did this again, echo 0 > all of the involved block devices. When
>> looking at the events/wbt trace point, I see that wbt activity stops at
>> that moment.
>>
>> No difference in symptoms.
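For completeness, the "echo 0" step per device was roughly this (device
names here are illustrative):

```shell
# Disable block-layer writeback throttling per device: writing 0 turns
# wbt off; a positive value sets the target latency in microseconds.
for dev in sda sdb sdc; do                     # illustrative device names
    f="/sys/block/$dev/queue/wbt_lat_usec"
    if [ -w "$f" ]; then
        echo 0 > "$f" 2>/dev/null || true
    fi
done

# Watching the wbt tracepoints confirms throttling activity stops:
#   echo 1 > /sys/kernel/debug/tracing/events/wbt/enable
#   cat /sys/kernel/debug/tracing/trace_pipe
```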
>
> Ok. Worth a shot... but in retrospect not really, since I just realized
> that I had your entire scenario backwards. Never mind then.
np
>> Dirty buffers were ~ 2G in size. I can modify the numbers to make it
>> bigger or smaller. There's absolutely no change in behavior of the system.
>
> Ok. I also have 2GB max-dirty and 1GB writeback-dirty, which is
> plenty for a 10G pipe.
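For reference, that 2G/1G split corresponds to the absolute-byte dirty
sysctls (the drop-in file path below is hypothetical):

```
# /etc/sysctl.d/99-writeback.conf (hypothetical path)
# Hard-cap dirty pages at 2 GiB; start background writeback at 1 GiB.
# The *_bytes variants override vm.dirty_ratio / vm.dirty_background_ratio.
vm.dirty_bytes = 2147483648
vm.dirty_background_bytes = 1073741824
```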
>
>> It doesn't. It's idle, waiting to finally get some data sent to it.
>
> Yup, looks like it.
>
>> All processing speed is inversely proportional to the cpu usage of this
>> kworker/u16:X+flush-btrfs-2 thread. If it reaches >95% kernel cpu usage,
>> everything slows down. The network is idle, the disks are idle. Incoming
>> rsync speed drops, the speed in which btrfs receive is reading input
>> drops, etc. As soon as kworker/u16:X+flush-btrfs-2 busy cpu usage gets
>> below ~ 95% again, throughput goes up.
>
> Only thing I can think of is that your rsync runs create insane amounts
> of small COW extents that need to be ordered/merged. Multiply by the
> number of processes and you're probably hitting some heavily contended
> code path.

Yes, that's what we're looking at. I see that my other email, sent to
the list some hours ago, hasn't shown up yet. Maybe I should resend with
a link to the attachment instead of carrying the attachment, or I'll
wait until tomorrow.
Hm,

Jul 25 21:06:44 syrinx postfix/smtp[32364]: 155C4609B67CF:
to=<linux-btrfs@vger.kernel.org>,
relay=vger.kernel.org[2620:137:e000::1:18]:25, delay=2.2,
delays=0.15/0.02/1.2/0.87, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as
7B1D9C08C5C0)
Jul 25 21:06:44 syrinx postfix/qmgr[951]: 155C4609B67CF: removed

O_o, so it was accepted, but I don't see it.
> Since the kworker isn't stuck forever but apparently makes progress,
> it's not dead, just slow/overloaded.
>
> A few years ago I started using rsync exclusively with --whole-file,
> since it's not just much faster but also creates significantly less
> fragmentation at the expense of some disk space, but whatever... in my
> case snapshot rotation takes care of that. Maybe it's an option for you.
>
> So maybe things to try:
>
> - run rsync with --whole-file
> - run fewer rsyncs in parallel (might not be necessary with --whole-file!)
>
> If that doesn't help someone else needs to chime in..

That someone else could be me, if only I had the work time available to
do it; for the past few months my hobby time has rather gone into
optimizing the vegetable garden and cutting down trees for firewood with
the neighbours.

And yes, all rsyncs have been running with --whole-file since 2014, to
reduce the total amount of reflink garbage in this now ~100T pile of
stuff with >200k subvols.

Doing fewer things in parallel does help, indeed. If I abort everything
and just run 1 of the 20 at a time, it makes progress. But I want to get
this over with before the year 2050.
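If I script a middle ground between 1 and 20, it will be something like
this (subvolume names and the per-job command are placeholders):

```shell
# Run at most 4 transfer jobs at a time instead of all 20 at once;
# xargs -P acts as a simple process pool over the list of subvolumes.
printf '%s\n' subvol-a subvol-b subvol-c subvol-d subvol-e |
    xargs -P4 -I{} sh -c 'echo "syncing {}"'
# the real per-job command would be something like:
#   rsync -a --whole-file ... && btrfs receive ...
```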
K
Thread overview (17 messages):
2020-07-25 14:24 Debugging abysmal write performance with 100% cpu kworker/u16:X+flush-btrfs-2 Hans van Kranenburg
2020-07-25 15:37 ` Holger Hoffstätte
2020-07-25 16:43 ` Hans van Kranenburg
2020-07-25 19:44 ` Holger Hoffstätte
2020-07-25 21:03 ` Hans van Kranenburg [this message]
2020-07-26 1:00 ` Chris Murphy
2020-07-25 21:27 ` Hans van Kranenburg
2020-07-26 8:10 ` A L
2020-07-26 0:50 ` Chris Murphy
2020-07-27 11:09 ` Qu Wenruo
2020-07-27 17:17 ` Hans van Kranenburg
2020-07-27 19:23 ` Chris Murphy
2020-07-27 23:16 ` Chris Murphy
2020-07-28 0:51 ` Qu Wenruo
2020-07-28 1:52 ` Qu Wenruo
2020-07-28 14:52 ` Hans van Kranenburg
2020-07-29 0:15 ` Qu Wenruo