From: "Song Bao Hua (Barry Song)" <song.bao.hua@hisilicon.com>
To: Morten Rasmussen, Tim Chen
Subject: RE: [RFC PATCH v3 0/2] scheduler: expose the topology of clusters and add cluster scheduler
Date: Fri, 8 Jan 2021 21:30:34 +0000
References: <20210106083026.40444-1-song.bao.hua@hisilicon.com>
 <737932c9-846a-0a6b-08b8-e2d2d95b67ce@linux.intel.com>
 <20210108151241.GA47324@e123083-lin>
In-Reply-To: <20210108151241.GA47324@e123083-lin>
Cc: juri.lelli@redhat.com, mark.rutland@arm.com, peterz@infradead.org,
 catalin.marinas@arm.com, bsegall@google.com, xuwei (O), will@kernel.org,
 vincent.guittot@linaro.org, aubrey.li@linux.intel.com,
 linux-acpi@vger.kernel.org, mingo@redhat.com, mgorman@suse.de,
 valentin.schneider@arm.com, lenb@kernel.org, linuxarm@openeuler.org,
 rostedt@goodmis.org, Zengtao (B), Jonathan Cameron,
 dietmar.eggemann@arm.com, linux-arm-kernel@lists.infradead.org,
 gregkh@linuxfoundation.org, rjw@rjwysocki.net, linux-kernel@vger.kernel.org,
 sudeep.holla@arm.com, tiantao (H)

> -----Original Message-----
> From: Morten Rasmussen [mailto:morten.rasmussen@arm.com]
> Sent: Saturday, January 9, 2021 4:13 AM
> To: Tim Chen
> Cc: Song Bao Hua (Barry Song); valentin.schneider@arm.com;
> catalin.marinas@arm.com; will@kernel.org; rjw@rjwysocki.net;
> vincent.guittot@linaro.org; lenb@kernel.org; gregkh@linuxfoundation.org;
> Jonathan Cameron; mingo@redhat.com; peterz@infradead.org;
> juri.lelli@redhat.com; dietmar.eggemann@arm.com; rostedt@goodmis.org;
> bsegall@google.com; mgorman@suse.de; mark.rutland@arm.com;
> sudeep.holla@arm.com; aubrey.li@linux.intel.com;
> linux-arm-kernel@lists.infradead.org; linux-kernel@vger.kernel.org;
> linux-acpi@vger.kernel.org; linuxarm@openeuler.org; xuwei (O);
> Zengtao (B); tiantao (H)
> Subject: Re: [RFC PATCH v3 0/2] scheduler: expose the topology of clusters
> and add cluster scheduler
>
> On Thu, Jan 07, 2021 at 03:16:47PM -0800, Tim Chen wrote:
> > On 1/6/21 12:30 AM, Barry Song wrote:
> > > ARM64 server chip Kunpeng 920 has 6 clusters in each NUMA node, and
> > > each cluster has 4 cpus. All clusters share L3 cache data, while each
> > > cluster has a local L3 tag. On the other hand, each cluster shares some
> > > internal system bus. This means cache is much more affine inside one
> > > cluster than across clusters.
> >
> > There is a similar need for clustering in x86. Some x86 cores share L2
> > caches in a way that is similar to the cluster in Kunpeng 920 (e.g. on
> > Jacobsville there are 6 clusters of 4 Atom cores, each cluster sharing
> > a separate L2, and 24 cores sharing L3). Having a sched domain at the
> > L2 cluster helps spread load among L2 domains. This reduces L2 cache
> > contention and helps performance for low to moderate load scenarios.
>
> IIUC, you are arguing for the exact opposite behaviour, i.e. balancing
> between L2 caches while Barry is after consolidating tasks within the
> boundaries of a L3 tag cache. One helps cache utilization, the other
> communication latency between tasks. Am I missing something?

Morten, that is not quite right: we are both actually looking for the same
behaviour. My patch does the exact same spreading as Tim's patch.
Consider the two cases below.

Case 1: we have two tasks with no relationship, running in a system with
2 clusters and 8 cpus.
Without the cluster sched_domain, these two tasks might be placed as below:

 +-------------------+   +-----------------+
 | +----+  +----+    |   |                 |
 | |task|  |task|    |   |                 |
 | |1   |  |2   |    |   |                 |
 | +----+  +----+    |   |                 |
 |                   |   |                 |
 |     cluster1      |   |    cluster2     |
 +-------------------+   +-----------------+

With the cluster sched_domain, load balancing will spread them as below:

 +-------------------+   +-----------------+
 | +----+            |   | +----+          |
 | |task|            |   | |task|          |
 | |1   |            |   | |2   |          |
 | +----+            |   | +----+          |
 |                   |   |                 |
 |     cluster1      |   |    cluster2     |
 +-------------------+   +-----------------+

Then task1 and task2 each get more cache and see less cache contention, so
they get better performance. That is exactly what my original patch
achieves, and what Tim's patch does as well: once we add a sched_domain,
load balancing gets involved.

Case 2: we have 8 tasks running in a system with 2 clusters and 8 cpus,
but they work in 4 waker-wakee pairs:

Task1 wakes up task4
Task2 wakes up task5
Task3 wakes up task6
Task7 wakes up task8

With my change to select_idle_sibling(), the WAKE_AFFINE mechanism will try
to put task1 and 4, task2 and 5, task3 and 6, and task7 and 8 each into the
same cluster, rather than placing them on random ones of the 8 cpus.
However, with my change the 8 tasks still spread among the 8 cpus, because
load balancing is still working:

 +---------------------------+   +----------------------+
 | +----+    +-----+         |   | +----+    +-----+    |
 | |task|    |task |         |   | |task|    |task |    |
 | |1   |    |4    |         |   | |2   |    |5    |    |
 | +----+    +-----+         |   | +----+    +-----+    |
 |                           |   |                      |
 |         cluster1          |   |       cluster2       |
 |                           |   |                      |
 | +-----+   +------+        |   | +-----+   +------+   |
 | |task |   |task  |        |   | |task |   |task  |   |
 | |3    |   |6     |        |   | |7    |   |8     |   |
 | +-----+   +------+        |   | +-----+   +------+   |
 +---------------------------+   +----------------------+

The 3rd case is trickier: task1 and task2 have a close relationship as a
waker-wakee pair.
With my current patch, select_idle_sibling() wants to put them in one
cluster while load balancing wants to put them in two clusters, and load
balancing will win. So we may need a mechanism similar to the NUMA
imbalance adjustment:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/kernel/sched/fair.c?id=b396f52326de20
If we permit a small imbalance between clusters, select_idle_sibling()
will win, and task1 and task2 get better cache affinity. This 3rd case
could be our goal for the next step.

Thanks
Barry

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel