From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>,
linux-kernel@vger.kernel.org, Jan Kara <jack@suse.com>,
Andrew Morton <akpm@linux-foundation.org>,
kernel-team@fb.com
Subject: Re: [PATCH] bdi: Move cgroup bdi_writeback to a dedicated low concurrency workqueue
Date: Wed, 23 May 2018 11:39:07 -0700
Message-ID: <20180523183907.GZ3803@linux.vnet.ibm.com>
In-Reply-To: <20180523175632.GO1718769@devbig577.frc2.facebook.com>

On Wed, May 23, 2018 at 10:56:32AM -0700, Tejun Heo wrote:
> From 0aa2e9b921d6db71150633ff290199554f0842a8 Mon Sep 17 00:00:00 2001
> From: Tejun Heo <tj@kernel.org>
> Date: Wed, 23 May 2018 10:29:00 -0700
>
> cgwb_release() punts the actual release to cgwb_release_workfn() on
> system_wq. Depending on the number of cgroups or block devices, there
> can be a lot of cgwb_release_workfn() in flight at the same time.
>
> We're periodically seeing close to 256 kworkers getting stuck with the
> following stack trace, and over time the entire system gets stuck.
>
> [<ffffffff810ee40c>] _synchronize_rcu_expedited.constprop.72+0x2fc/0x330
> [<ffffffff810ee634>] synchronize_rcu_expedited+0x24/0x30
> [<ffffffff811ccf23>] bdi_unregister+0x53/0x290
> [<ffffffff811cd1e9>] release_bdi+0x89/0xc0
> [<ffffffff811cd645>] wb_exit+0x85/0xa0
> [<ffffffff811cdc84>] cgwb_release_workfn+0x54/0xb0
> [<ffffffff810a68d0>] process_one_work+0x150/0x410
> [<ffffffff810a71fd>] worker_thread+0x6d/0x520
> [<ffffffff810ad3dc>] kthread+0x12c/0x160
> [<ffffffff81969019>] ret_from_fork+0x29/0x40
> [<ffffffffffffffff>] 0xffffffffffffffff
>
> The events leading to the lockup are...
>
> 1. Many cgwb_release_workfn() work items are queued at the same time
> and all system_wq kworkers are assigned to execute them.
>
> 2. They all end up calling synchronize_rcu_expedited(). One of them
> wins and tries to perform the expedited synchronization.
>
> 3. However, that involves queueing rcu_exp_work to system_wq and
> waiting for it. Because #1 is holding all available kworkers on
> system_wq, rcu_exp_work can't be executed. cgwb_release_workfn()
> is waiting for synchronize_rcu_expedited() which in turn is waiting
> for cgwb_release_workfn() to free up some of the kworkers.
>
> We shouldn't be scheduling hundreds of cgwb_release_workfn() at the
> same time. There's nothing to be gained from that. This patch
> updates cgwb release path to use a dedicated percpu workqueue with
> @max_active of 1.
>
> While this resolves the problem at hand, it might be a good idea to
> isolate rcu_exp_work to its own workqueue too as it can be used from
> various paths and is prone to this sort of indirect A-A deadlock.

Commit ad7c946b35ad4 ("rcu: Create RCU-specific workqueues with rescuers")
was accepted into mainline this past merge window. Does that do what
you want, or are you looking for something else?

							Thanx, Paul
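
For readers following the thread, the three-step lockup in the quoted changelog can be imitated in user space. The sketch below is only an analogy, not kernel code: a bounded Python thread pool stands in for system_wq, a trivial helper task stands in for rcu_exp_work, and the names count_completed, release_job, and dedicated_helper_pool are hypothetical. A timeout is used so the demonstration terminates, whereas the kernel work items described above would wait indefinitely.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, TimeoutError


def count_completed(pool_size, dedicated_helper_pool, timeout=0.3):
    """Return how many release-like jobs saw their helper finish in time.

    Every job occupies one worker of a bounded shared pool, queues a helper
    task, and waits for it. If the helper is queued on the same pool, no
    worker is free to run it (the indirect A-A deadlock from the changelog).
    """
    with ThreadPoolExecutor(max_workers=pool_size) as shared:
        # Assumption for illustration: one extra worker models a dedicated queue.
        helper_pool = (
            ThreadPoolExecutor(max_workers=1) if dedicated_helper_pool else shared
        )
        barrier = threading.Barrier(pool_size)

        def release_job():
            # Step 1: wait until every worker of the shared pool is occupied.
            barrier.wait()
            # Steps 2 and 3: queue the helper and block until it completes.
            helper = helper_pool.submit(lambda: None)
            try:
                helper.result(timeout=timeout)
                finished = True
            except TimeoutError:
                finished = False
            # Hold the worker until all jobs have decided, so that a job
            # which gives up early cannot free a worker for another helper.
            barrier.wait()
            return finished

        futures = [shared.submit(release_job) for _ in range(pool_size)]
        completed = sum(f.result() for f in futures)
        if dedicated_helper_pool:
            helper_pool.shutdown()
        return completed


if __name__ == "__main__":
    print("helper on shared pool:   ", count_completed(4, False))
    print("helper on dedicated pool:", count_completed(4, True))
```

With the helper on the shared pool, every job times out because all workers are busy waiting; with a separate pool for the helper, every job completes. That mirrors the suggestion in the changelog to give rcu_exp_work its own workqueue, although the real kernel mechanisms (workqueue concurrency limits and rescuers) differ from this model.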