From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.2 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS, UNPARSEABLE_RELAY,USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 95AA0C3F2CD for ; Tue, 3 Mar 2020 14:17:49 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 71C702083E for ; Tue, 3 Mar 2020 14:17:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728470AbgCCORs (ORCPT ); Tue, 3 Mar 2020 09:17:48 -0500 Received: from out30-130.freemail.mail.aliyun.com ([115.124.30.130]:45966 "EHLO out30-130.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727370AbgCCORr (ORCPT ); Tue, 3 Mar 2020 09:17:47 -0500 X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R631e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04426;MF=yun.wang@linux.alibaba.com;NM=1;PH=DS;RN=9;SR=0;TI=SMTPD_---0TrZAYVM_1583245023; Received: from testdeMacBook-Pro.local(mailfrom:yun.wang@linux.alibaba.com fp:SMTPD_---0TrZAYVM_1583245023) by smtp.aliyun-inc.com(127.0.0.1); Tue, 03 Mar 2020 22:17:18 +0800 To: Ingo Molnar , Peter Zijlstra , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Steven Rostedt , Ben Segall , Mel Gorman , "open list:SCHEDULER" From: =?UTF-8?B?546L6LSH?= Subject: [RFC PATCH] sched: fix the nonsense shares when load of cfs_rq is too, small Message-ID: <44fa1cee-08db-e4ab-e5ab-08d6fbd421d7@linux.alibaba.com> Date: Tue, 3 Mar 2020 22:17:03 +0800 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:68.0) Gecko/20100101 Thunderbird/68.4.1 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org During our testing, we found a case that shares no longer working correctly, the cgroup topology is like: /sys/fs/cgroup/cpu/A (shares=102400) /sys/fs/cgroup/cpu/A/B (shares=2) /sys/fs/cgroup/cpu/A/B/C (shares=1024) /sys/fs/cgroup/cpu/D (shares=1024) /sys/fs/cgroup/cpu/D/E (shares=1024) /sys/fs/cgroup/cpu/D/E/F (shares=1024) The same benchmark is running in group C & F, no other tasks are running, the benchmark is capable to consumed all the CPUs. We suppose the group C will win more CPU resources since it could enjoy all the shares of group A, but it's F who wins much more. The reason is because we have group B with shares as 2, which make the group A 'cfs_rq->load.weight' very small. And in calc_group_shares() we calculate shares as: load = max(scale_load_down(cfs_rq->load.weight), cfs_rq->avg.load_avg); shares = (tg_shares * load) / tg_weight; Since the 'cfs_rq->load.weight' is too small, the load become 0 in here, although 'tg_shares' is 102400, shares of the se which stand for group A on root cfs_rq become 2. While the se of D on root cfs_rq is far more bigger than 2, so it wins the battle. This patch add a check on the zero load and make it as MIN_SHARES to fix the nonsense shares, after applied the group C wins as expected. Signed-off-by: Michael Wang --- kernel/sched/fair.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 84594f8aeaf8..53d705f75fa4 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -3182,6 +3182,8 @@ static long calc_group_shares(struct cfs_rq *cfs_rq) tg_shares = READ_ONCE(tg->shares); load = max(scale_load_down(cfs_rq->load.weight), cfs_rq->avg.load_avg); + if (!load && cfs_rq->load.weight) + load = MIN_SHARES; tg_weight = atomic_long_read(&tg->load_avg); -- 2.14.4.44.g2045bb6