From: bsegall@google.com
To: Phil Auld <pauld@redhat.com>
Cc: mingo@redhat.com, peterz@infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC] sched/fair: hard lockup in sched_cfs_period_timer
Date: Wed, 06 Mar 2019 11:25:02 -0800
Message-ID: <xm26wolbyfe9.fsf@bsegall-linux.svl.corp.google.com>
In-Reply-To: <20190306162313.GB8786@pauld.bos.csb> (Phil Auld's message of "Wed, 6 Mar 2019 11:23:13 -0500")

Phil Auld <pauld@redhat.com> writes:
> On Tue, Mar 05, 2019 at 12:45:34PM -0800 bsegall@google.com wrote:
>> Phil Auld <pauld@redhat.com> writes:
>>
>> > Interestingly, if I limit the number of child cgroups to the number of
>> > them I'm actually putting processes into (16 down from 2500) the problem
>> > does not reproduce.
>>
>> That is indeed interesting, and definitely not something we'd want to
>> matter. (Particularly if it's not root->a->b->c...->throttled_cgroup or
>> root->throttled->a->...->thread vs root->throttled_cgroup, which is what
>> I was originally thinking of)
>>
>
> The locking may be a red herring.
>
> The setup is root->throttled->a where a is 1-2500. There are 4 threads in
> each of the first 16 a groups. The parent, throttled, is where the
> cfs_period/quota_us are set.
>
> I wonder if the problem is the walk_tg_tree_from() call in unthrottle_cfs_rq().
>
> The distribute_cfs_runtime() looks to be O(n * m) where n is the number of
> throttled cfs_rqs and m is the number of child cgroups. But I'm not
> completely clear on how the hierarchical cgroups play together here.
>
> I'll pull on this thread some.
>
> Thanks for your input.
>
>
> Cheers,
> Phil

Yeah, that walk isn't under the cfs_b lock, but it is still part of
distribute (and under the rq lock, which might also matter). I was
thinking too much about just the cfs_b regions. I'm not sure there's
any good general optimization there.

I suppose cfs_rqs (tgs/cfs_bs?) could have a "nearest ancestor with a
quota" pointer, and ones with a quota could have a "descendants with
quota" list, parallel to the children/parent lists of tgs. Then
throttle/unthrottle would only have to visit these lists, and child
cgroups/cfs_rqs without their own quotas would just check
cfs_rq->nearest_quota_cfs_rq->throttle_count. throttled_clock_task_time
can also probably be tracked there.
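
The idea above could be sketched in userspace C roughly as follows. This
is only an illustration under stated assumptions, not kernel code:
struct cfs_rq_sketch, its fields, and the sketch_* helpers are
hypothetical names, there is one object per group rather than one per
CPU, and locking and throttled_clock_task_time accounting are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a cfs_rq; not the kernel's struct. */
struct cfs_rq_sketch {
	struct cfs_rq_sketch *parent;         /* ordinary hierarchy parent */
	bool has_quota;                       /* group sets its own quota */
	int throttle_count;                   /* only used when has_quota */
	struct cfs_rq_sketch *nearest_quota;  /* nearest proper ancestor with a quota */
	struct cfs_rq_sketch *quota_children; /* head of "descendants with quota" list */
	struct cfs_rq_sketch *quota_sibling;  /* next entry on that ancestor's list */
};

/* Link rq below parent and maintain the quota-only pointers. */
static void sketch_attach(struct cfs_rq_sketch *rq,
			  struct cfs_rq_sketch *parent, bool has_quota)
{
	assert(rq != NULL);
	rq->parent = parent;
	rq->has_quota = has_quota;
	rq->throttle_count = 0;
	rq->quota_children = NULL;
	rq->quota_sibling = NULL;
	rq->nearest_quota = NULL;
	if (parent)
		rq->nearest_quota = parent->has_quota ? parent
						      : parent->nearest_quota;
	if (has_quota && rq->nearest_quota) {
		/* Only quota-bearing groups go on the ancestor's list. */
		rq->quota_sibling = rq->nearest_quota->quota_children;
		rq->nearest_quota->quota_children = rq;
		/* Inherit the state of an already-throttled ancestor. */
		rq->throttle_count = rq->nearest_quota->throttle_count;
	}
}

/*
 * Throttle (delta = +1) or unthrottle (delta = -1) a quota-bearing group.
 * Only quota-bearing descendants are visited; returns how many were touched.
 */
static int sketch_adjust_throttle(struct cfs_rq_sketch *rq, int delta)
{
	int visited = 1;

	rq->throttle_count += delta;
	for (struct cfs_rq_sketch *c = rq->quota_children; c; c = c->quota_sibling)
		visited += sketch_adjust_throttle(c, delta);
	return visited;
}

/* Groups without a quota just consult their nearest quota-bearing ancestor. */
static bool sketch_is_throttled(const struct cfs_rq_sketch *rq)
{
	const struct cfs_rq_sketch *q = rq->has_quota ? rq : rq->nearest_quota;

	return q && q->throttle_count > 0;
}
```

With the root->throttled->a layout described earlier in the thread (2500
children under one quota-bearing parent, none with their own quota),
sketch_adjust_throttle() on the parent touches only the parent itself
instead of walking all 2500 children, and each child answers
sketch_is_throttled() with a single pointer chase.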