From: "Song Bao Hua (Barry Song)"
To: Dietmar Eggemann, Morten Rasmussen, Tim Chen
CC: "valentin.schneider@arm.com", "catalin.marinas@arm.com",
 "will@kernel.org", "rjw@rjwysocki.net", "vincent.guittot@linaro.org",
 "lenb@kernel.org", "gregkh@linuxfoundation.org", Jonathan Cameron,
 "mingo@redhat.com", "peterz@infradead.org", "juri.lelli@redhat.com",
 "rostedt@goodmis.org", "bsegall@google.com", "mgorman@suse.de",
 "mark.rutland@arm.com", "sudeep.holla@arm.com",
 "aubrey.li@linux.intel.com", "linux-arm-kernel@lists.infradead.org",
 "linux-kernel@vger.kernel.org", "linux-acpi@vger.kernel.org",
 "linuxarm@openeuler.org", "xuwei (O)", "Zengtao (B)", "tiantao (H)",
 "Guodong Xu", yangyicong
Subject: RE: [RFC PATCH v3 0/2] scheduler: expose the topology of clusters and add cluster scheduler
Date: Tue, 13 Apr 2021 10:45:44 +0000
Message-ID: <9201b56a29dd4dacb7d9fcbf307ca5ff@hisilicon.com>
In-Reply-To: <4fdc781e-7385-2ae6-d9c9-3ec165f473c4@arm.com>

> -----Original Message-----
> From: Dietmar Eggemann [mailto:dietmar.eggemann@arm.com]
> Sent: Wednesday, January 13, 2021 12:00 AM
> To: Morten Rasmussen; Tim Chen
> Subject: Re: [RFC PATCH v3 0/2] scheduler: expose the topology of clusters and
> add cluster scheduler
>
> On 11/01/2021 10:28, Morten Rasmussen wrote:
> > On Fri, Jan 08, 2021 at 12:22:41PM -0800, Tim Chen wrote:
> >>
> >>
> >> On 1/8/21 7:12 AM, Morten Rasmussen wrote:
> >>> On Thu, Jan 07, 2021 at 03:16:47PM -0800, Tim Chen wrote:
> >>>> On 1/6/21 12:30 AM, Barry Song wrote:
>
> [...]
>
> >> I think it is going to depend on the workload. If there are dependent
> >> tasks that communicate with one another, putting them together
> >> in the same cluster will be the right thing to do to reduce communication
> >> costs. On the other hand, if the tasks are independent, putting them
> >> together on the same cluster will increase resource contention and
> >> spreading them out will be better.
> >
> > Agree. That is exactly where I'm coming from. This is all about the task
> > placement policy.
> > We generally tend to spread tasks to avoid resource
> > contention, SMT and caches, which seems to be what you are proposing to
> > extend. I think that makes sense given it can produce significant
> > benefits.
> >
> >>
> >> Any thoughts on what is the right clustering "tag" to use to clump
> >> related tasks together?
> >> Cgroup? Pid? Tasks with same mm?
> >
> > I think this is the real question. I think the closest thing we have at
> > the moment is the wakee/waker flip heuristic. This seems to be related.
> > Perhaps the wake_affine tricks can serve as starting point?
>
> wake_wide() switches between packing (select_idle_sibling(), llc_size
> CPUs) and spreading (find_idlest_cpu(), all CPUs).
>
> AFAICS, since none of the sched domains set SD_BALANCE_WAKE, currently
> all wakeups are (llc-)packed.
>
>  select_task_rq_fair()
>
>    for_each_domain(cpu, tmp)
>
>      if (tmp->flags & sd_flag)
>        sd = tmp;
>
>
> In case we would like to further distinguish between llc-packing and
> even narrower (cluster or MC-L2)-packing, we would introduce a 2. level
> packing vs. spreading heuristic further down in sis().
>
> IMHO, Barry's current implementation doesn't do this right now. Instead
> he's trying to pack on cluster first and if not successful look further
> among the remaining llc CPUs for an idle CPU.

Right now, in the main cases where wake_affine is used to get better
performance, the processes are actually bound within one NUMA node,
which is also an LLC on Kunpeng 920.

Probably LLC=NUMA is also true for X86 Jacobsville, Tim?

So one possible way to emulate a 2-level packing might be:
if the affinity cpusets of the waker and the wakee are both subsets
of the same LLC, we use the cluster alone as the factor to decide
whether to pack or not, and ignore the LLC.
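For illustration only, the subset test described above could look roughly
like the userspace sketch below. It is not kernel code and not part of the
posted patch: cpu_mask_t, mask_subset() and choose_pack_scope() are
hypothetical names, and a 64-bit word stands in for a real cpumask (in the
kernel, cpumask_subset() would be the natural counterpart).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: one bit per CPU, at most 64 CPUs. */
typedef uint64_t cpu_mask_t;

/* Which topology level decides packing vs. spreading for a wakeup. */
enum pack_scope { PACK_LLC, PACK_CLUSTER };

/* True if every CPU set in 'sub' is also set in 'super'. */
static bool mask_subset(cpu_mask_t sub, cpu_mask_t super)
{
	return (sub & ~super) == 0;
}

/*
 * If both tasks are confined to the same LLC, the LLC level cannot
 * distinguish any placements, so let the cluster level decide.
 * Otherwise keep today's LLC-based decision.
 */
static enum pack_scope choose_pack_scope(cpu_mask_t waker_allowed,
					 cpu_mask_t wakee_allowed,
					 cpu_mask_t llc_span)
{
	if (mask_subset(waker_allowed, llc_span) &&
	    mask_subset(wakee_allowed, llc_span))
		return PACK_CLUSTER;
	return PACK_LLC;
}
```

With an LLC spanning CPUs 0-7 (llc_span = 0xFF), a waker allowed on 0x0F
and a wakee allowed on 0xF0 would yield PACK_CLUSTER; a wakee that may also
run on CPU 8 (0x1F0) would fall back to PACK_LLC.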
I haven't really done this, but the code below produces the same
result by forcing llc_id=cluster_id:

diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index d72eb8d..3d78097 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -107,7 +107,7 @@ int __init parse_acpi_topology(void)
 		cpu_topology[cpu].cluster_id = topology_id;
 		topology_id = find_acpi_cpu_topology_package(cpu);
 		cpu_topology[cpu].package_id = topology_id;
-
+#if 0
 		i = acpi_find_last_cache_level(cpu);

 		if (i > 0) {
@@ -119,8 +119,11 @@ int __init parse_acpi_topology(void)
 			if (cache_id > 0)
 				cpu_topology[cpu].llc_id = cache_id;
 		}
-	}
+#else
+		cpu_topology[cpu].llc_id = cpu_topology[cpu].cluster_id;
+#endif

+	}
 	return 0;
 }
 #endif

With this, I have seen some major improvement in hackbench, especially
for the monogamous communication model (fds_num=1, one sender for one
receiver):

numactl -N 0 hackbench -p -T -l 200000 -f 1 -g $1

I have tested -g (group_nums) 6, 12, 18, 24, 28, 32. For each g,
I ran it 20 times and took the average value.

The result is as below:

g=     6       12      18      24      28      32
w/o    1.3243  1.6741  1.7560  1.9036  2.0262  2.1826
w/     1.1314  1.1864  1.4494  1.6159  1.9078  2.1249

Using top -H and hitting "f" to show the CPU of each thread, I am seeing
that the two threads in one group are likely to run in the same cluster.
That's why the hackbench latency decreases so much.

Thanks
Barry
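As a reading aid for the hackbench table above, the relative reduction per
group count can be worked out with a few lines of C. This is only a sketch:
the array and function names are made up for illustration, the numbers are
copied from the table, and it assumes the values are hackbench's reported
times, where lower is better.

```c
#include <stddef.h>

/* Averages copied from the table above; index i belongs to groups[i]. */
static const int groups[] = { 6, 12, 18, 24, 28, 32 };
static const double t_without[] = {
	1.3243, 1.6741, 1.7560, 1.9036, 2.0262, 2.1826
};
static const double t_with[] = {
	1.1314, 1.1864, 1.4494, 1.6159, 1.9078, 2.1249
};
#define NR_POINTS (sizeof(groups) / sizeof(groups[0]))

/* Relative reduction in percent: 100 * (before - after) / before. */
static double reduction_pct(double before, double after)
{
	return 100.0 * (before - after) / before;
}

/* Index of the group count with the largest relative reduction. */
static size_t best_index(void)
{
	size_t best = 0;

	for (size_t i = 1; i < NR_POINTS; i++) {
		if (reduction_pct(t_without[i], t_with[i]) >
		    reduction_pct(t_without[best], t_with[best]))
			best = i;
	}
	return best;
}
```

Applied to the table, this gives reductions of roughly 14.6%, 29.1%, 17.5%,
15.1%, 5.8% and 2.6% for g = 6, 12, 18, 24, 28 and 32, so the gain peaks at
g=12 and shrinks as the group count grows.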