From: Yicong Yang
To: "Gautham R. Shenoy", K Prateek Nayak
CC: <21cnbao@gmail.com>
Subject: Re: [PATCH v4 1/2] sched: Add per_cpu cluster domain info and cpus_share_resources API
Date: Thu, 16 Jun 2022 16:10:51 +0800
References: <20220609120622.47724-1-yangyicong@hisilicon.com> <20220609120622.47724-2-yangyicong@hisilicon.com>

On 2022/6/15 23:43, Gautham R. Shenoy wrote:
> On Wed, Jun 15, 2022 at 07:49:22PM +0530, K Prateek Nayak wrote:
>
> [..snip..]
>
>>
>> - Bisecting:
>>
>> When we ran the tests with only Patch 1 of the series, the
>> regression was visible and the numbers were worse.
>>
>> Clients:        tip                    cluster                 Patch 1 Only
>>     8      3263.81 (0.00 pct)      3086.81 (-5.42 pct)     3018.63 (-7.51 pct)
>>    16      6011.19 (0.00 pct)      5360.28 (-10.82 pct)    4869.26 (-18.99 pct)
>>    32     12058.31 (0.00 pct)      8769.08 (-27.27 pct)    8159.60 (-32.33 pct)
>>    64     21258.21 (0.00 pct)     19021.09 (-10.52 pct)   13161.92 (-38.08 pct)
>>
>> We further bisected the hunks to narrow down the cause to the per CPU
>> variable declarations.
>>
>>
>>>
>>> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
>>> index 01259611beb9..b9bcfcf8d14d 100644
>>> --- a/kernel/sched/sched.h
>>> +++ b/kernel/sched/sched.h
>>> @@ -1753,7 +1753,9 @@ static inline struct sched_domain *lowest_flag_domain(int cpu, int flag)
>>>  DECLARE_PER_CPU(struct sched_domain __rcu *, sd_llc);
>>>  DECLARE_PER_CPU(int, sd_llc_size);
>>>  DECLARE_PER_CPU(int, sd_llc_id);
>>> +DECLARE_PER_CPU(int, sd_share_id);
>>>  DECLARE_PER_CPU(struct sched_domain_shared __rcu *, sd_llc_shared);
>>> +DECLARE_PER_CPU(struct sched_domain __rcu *, sd_cluster);
>>
>> The main reason for the regression seems to be the above declarations.
>
> I think you meant that the regressions are due to the DEFINE_PER_CPU()
> instances from the following hunk:
>
>>> @@ -664,6 +664,8 @@ static void destroy_sched_domains(struct sched_domain *sd)
>>>  DEFINE_PER_CPU(struct sched_domain __rcu *, sd_llc);
>>>  DEFINE_PER_CPU(int, sd_llc_size);
>>>  DEFINE_PER_CPU(int, sd_llc_id);
>>> +DEFINE_PER_CPU(int, sd_share_id);
>>> +DEFINE_PER_CPU(struct sched_domain __rcu *, sd_cluster);
>>>  DEFINE_PER_CPU(struct sched_domain_shared __rcu *, sd_llc_shared);
>>>  DEFINE_PER_CPU(struct sched_domain __rcu *, sd_numa);
>>>  DEFINE_PER_CPU(struct sched_domain __rcu *, sd_asym_packing);
>>>
>
>
> The System.map diff for these variables between tip vs tip +
> cluster-sched-v4 on your test system looks as follows:
>
>  0000000000020520 D sd_asym_packing
>  0000000000020528 D sd_numa
> -0000000000020530 D sd_llc_shared
> -0000000000020538 D sd_llc_id
> -000000000002053c D sd_llc_size
> -0000000000020540 D sd_llc
> +0000000000020530 D sd_cluster
> +0000000000020538 D sd_llc_shared

It looks like the entries below now sit in a different cacheline
(assuming 64B cachelines), whereas sd_llc_id and sd_llc_shared
previously shared one.

> +0000000000020540 D sd_share_id
> +0000000000020544 D sd_llc_id
> +0000000000020548 D sd_llc_size
> +0000000000020550 D sd_llc
>
> The allocations are in the reverse-order of the definitions.
>
> That perhaps explains why you no longer see the regression when you
> define the sd_share_id and sd_cluster per-cpu definitions at the
> beginning as indicated by the following
>
>> - Move the declarations of sd_share_id and sd_cluster to the top
>>
>> Clients:        tip                    Patch 1                 Patch 1 (Declaration on Top)
>>     8      3255.69 (0.00 pct)      3018.63 (-7.28 pct)     3072.30 (-5.63 pct)
>>    16      6092.67 (0.00 pct)      4869.26 (-20.08 pct)    5586.59 (-8.30 pct)
>>    32     11156.56 (0.00 pct)      8159.60 (-26.86 pct)   11184.17 (0.24 pct)
>>    64     21019.97 (0.00 pct)     13161.92 (-37.38 pct)   20289.70 (-3.47 pct)
>
>
> --
> Thanks and Regards
> gautham.
> .
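
As a quick sanity check of the cacheline remark above, the offsets from
the System.map diff can be bucketed by cacheline. Below is a minimal
userspace sketch, not kernel code. It assumes 64-byte cachelines and
that the per-CPU offsets land in memory as printed; the helper and enum
names (cacheline_of, same_cacheline, check_layout, TIP_*, V4_*) are made
up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 64-byte cachelines, as in the inline comment above. */
#define CACHELINE_BYTES 64u

/* Per-CPU section offsets copied from the quoted System.map diff. */
enum {
	/* tip */
	TIP_SD_LLC_SHARED = 0x20530,
	TIP_SD_LLC_ID     = 0x20538,
	TIP_SD_LLC_SIZE   = 0x2053c,
	TIP_SD_LLC        = 0x20540,
	/* tip + cluster-sched-v4 */
	V4_SD_CLUSTER     = 0x20530,
	V4_SD_LLC_SHARED  = 0x20538,
	V4_SD_SHARE_ID    = 0x20540,
	V4_SD_LLC_ID      = 0x20544,
	V4_SD_LLC_SIZE    = 0x20548,
	V4_SD_LLC         = 0x20550,
};

/* Index of the cacheline that contains the given offset. */
static inline uint64_t cacheline_of(uint64_t offset)
{
	return offset / CACHELINE_BYTES;
}

/* True if both offsets fall into the same cacheline. */
static inline bool same_cacheline(uint64_t a, uint64_t b)
{
	return cacheline_of(a) == cacheline_of(b);
}

/* Asserts the layout change described in the reply above. */
static inline void check_layout(void)
{
	/* tip: sd_llc_shared, sd_llc_id and sd_llc_size share a line. */
	assert(same_cacheline(TIP_SD_LLC_SHARED, TIP_SD_LLC_ID));
	assert(same_cacheline(TIP_SD_LLC_SHARED, TIP_SD_LLC_SIZE));
	/* v4: sd_llc_id moves to the next line, away from sd_llc_shared. */
	assert(!same_cacheline(V4_SD_LLC_SHARED, V4_SD_LLC_ID));
	/* v4: sd_share_id, sd_llc_id, sd_llc_size and sd_llc share a line. */
	assert(same_cacheline(V4_SD_SHARE_ID, V4_SD_LLC));
}
```

Calling check_layout() from a small main() should run through without
tripping an assert, which is consistent with sd_llc_id moving away from
sd_llc_shared and into the line that holds sd_llc. Whether that alone
accounts for the regression is not something this sketch can show.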