Message-ID: <9cf43b10-85a2-1a83-057f-c43be339265e@bytedance.com>
Date: Sun, 26 Jun 2022 20:13:49 +0800
From: Abel Wu
Subject: Re: [PATCH v4 2/2] sched/fair: Scan cluster before scanning LLC in wake-up path
To: Yicong Yang, peterz@infradead.org, mingo@redhat.com, juri.lelli@redhat.com,
    vincent.guittot@linaro.org, tim.c.chen@linux.intel.com, gautham.shenoy@amd.com,
    linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org
Cc: dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com,
    bristot@redhat.com, prime.zeng@huawei.com, jonathan.cameron@huawei.com,
    ego@linux.vnet.ibm.com, srikar@linux.vnet.ibm.com, linuxarm@huawei.com,
    21cnbao@gmail.com, guodong.xu@linaro.org, hesham.almatary@huawei.com,
    john.garry@huawei.com, shenyang39@huawei.com
References: <20220609120622.47724-1-yangyicong@hisilicon.com> <20220609120622.47724-3-yangyicong@hisilicon.com>
In-Reply-To: <20220609120622.47724-3-yangyicong@hisilicon.com>

On 6/9/22 8:06 PM, Yicong Yang wrote:
> From: Barry Song
>
> For platforms with clusters, such as Kunpeng920, CPUs within the same cluster
> have lower latency when synchronizing and accessing shared resources like the
> cache. Thus, this patch tries to find an idle CPU within the cluster of the
> target CPU before scanning the whole LLC, to gain lower wake-up latency.
>
> Note that neither Kunpeng920 nor x86 Jacobsville supports SMT, so this patch
> does not consider SMT for the moment.
>
> Testing has been done on Kunpeng920 by pinning tasks to one NUMA node and to
> two NUMA nodes. On Kunpeng920, each NUMA node has 8 clusters and each cluster
> has 4 CPUs.
>
> With this patch, we noticed an improvement in tbench both within one NUMA
> node and across two NUMA nodes.
>
> On numa 0:
>                           5.19-rc1              patched
> Hmean     1      350.27 (   0.00%)      406.88 *  16.16%*
> Hmean     2      702.01 (   0.00%)      808.22 *  15.13%*
> Hmean     4     1405.14 (   0.00%)     1614.34 *  14.89%*
> Hmean     8     2830.53 (   0.00%)     3169.02 *  11.96%*
> Hmean     16    5597.95 (   0.00%)     6224.20 *  11.19%*
> Hmean     32   10537.38 (   0.00%)    10524.97 *  -0.12%*
> Hmean     64    8366.04 (   0.00%)     8437.41 *   0.85%*
> Hmean     128   7060.87 (   0.00%)     7150.25 *   1.27%*
>
> On numa 0-1:
>                           5.19-rc1              patched
> Hmean     1      346.11 (   0.00%)      408.47 *  18.02%*
> Hmean     2      693.34 (   0.00%)      805.78 *  16.22%*
> Hmean     4     1384.96 (   0.00%)     1602.49 *  15.71%*
> Hmean     8     2699.45 (   0.00%)     3069.98 *  13.73%*
> Hmean     16    5327.11 (   0.00%)     5688.19 *   6.78%*
> Hmean     32   10019.10 (   0.00%)    11862.56 *  18.40%*
> Hmean     64   13850.57 (   0.00%)    17748.54 *  28.14%*
> Hmean     128  12498.25 (   0.00%)    15541.59 *  24.35%*
> Hmean     256  11195.77 (   0.00%)    13854.06 *  23.74%*
>
> Tested-by: Yicong Yang
> Signed-off-by: Barry Song
> Signed-off-by: Yicong Yang
> ---
>  kernel/sched/fair.c | 44 +++++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 41 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 77b2048a9326..6d173e196ad3 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6327,6 +6327,40 @@ static inline int select_idle_smt(struct task_struct *p, struct sched_domain *sd
>  
>  #endif /* CONFIG_SCHED_SMT */
>  
> +#ifdef CONFIG_SCHED_CLUSTER
> +/*
> + * Scan the cluster domain for idle CPUs and clear cluster cpumask after scanning
> + */
> +static inline int scan_cluster(struct task_struct *p, struct cpumask *cpus,
> +			       int target, int *nr)
> +{
> +	struct sched_domain *sd = rcu_dereference(per_cpu(sd_cluster, target));
> +	int cpu, idle_cpu;
> +
> +	/* TODO: Support SMT system with cluster topology */
> +	if (!sched_smt_active() && sd) {
> +		for_each_cpu_and(cpu, cpus, sched_domain_span(sd)) {
> +			if (!--*nr)
> +				break;

Maybe just "return -1;" here instead of break? (see the untested sketch
at the end of this mail) :)

> +
> +			idle_cpu = __select_idle_cpu(cpu, p);
> +			if ((unsigned int)idle_cpu < nr_cpumask_bits)
> +				return idle_cpu;
> +		}
> +
> +		cpumask_andnot(cpus, cpus, sched_domain_span(sd));
> +	}
> +
> +	return -1;
> +}
> +#else
> +static inline int scan_cluster(struct task_struct *p, struct cpumask *cpus,
> +			       int target, int *nr)
> +{
> +	return -1;
> +}
> +#endif
> +
>  /*
>   * Scan the LLC domain for idle CPUs; this is dynamically regulated by
>   * comparing the average scan cost (tracked in sd->avg_scan_cost) against the
> @@ -6375,6 +6409,10 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool
>  		time = cpu_clock(this);
>  	}
>  
> +	idle_cpu = scan_cluster(p, cpus, target, &nr);
> +	if ((unsigned int)idle_cpu < nr_cpumask_bits)
> +		return idle_cpu;
> +
>  	for_each_cpu_wrap(cpu, cpus, target + 1) {
>  		if (has_idle_core) {
>  			i = select_idle_core(p, cpu, cpus, &idle_cpu);
> @@ -6382,7 +6420,7 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool
>  				return i;
>  
>  		} else {
> -			if (!--nr)
> +			if (--nr <= 0)
>  				return -1;
>  			idle_cpu = __select_idle_cpu(cpu, p);
>  			if ((unsigned int)idle_cpu < nr_cpumask_bits)
> @@ -6481,7 +6519,7 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
>  	/*
>  	 * If the previous CPU is cache affine and idle, don't be stupid:
>  	 */
> -	if (prev != target && cpus_share_cache(prev, target) &&
> +	if (prev != target && cpus_share_resources(prev, target) &&
>  	    (available_idle_cpu(prev) || sched_idle_cpu(prev)) &&
>  	    asym_fits_capacity(task_util, prev))
>  		return prev;
> @@ -6507,7 +6545,7 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
>  	p->recent_used_cpu = prev;
>  	if (recent_used_cpu != prev &&
>  	    recent_used_cpu != target &&
> -	    cpus_share_cache(recent_used_cpu, target) &&
> +	    cpus_share_resources(recent_used_cpu, target) &&
>  	    (available_idle_cpu(recent_used_cpu) || sched_idle_cpu(recent_used_cpu)) &&
>  	    cpumask_test_cpu(p->recent_used_cpu, p->cpus_ptr) &&
>  	    asym_fits_capacity(task_util, recent_used_cpu)) {
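
To illustrate the "return -1" nit above, this is roughly what I had in mind
(untested sketch only): once the scan budget is used up, scan_cluster() could
bail out immediately instead of breaking out of the loop and still doing the
cpumask_andnot(). If I read select_idle_cpu() correctly, the caller returns -1
on its next "--nr <= 0" check anyway, so this only avoids a bit of needless
work rather than changing behavior.

static inline int scan_cluster(struct task_struct *p, struct cpumask *cpus,
			       int target, int *nr)
{
	struct sched_domain *sd = rcu_dereference(per_cpu(sd_cluster, target));
	int cpu, idle_cpu;

	/* TODO: Support SMT systems with cluster topology */
	if (!sched_smt_active() && sd) {
		for_each_cpu_and(cpu, cpus, sched_domain_span(sd)) {
			/* scan budget exhausted, give up right away */
			if (!--*nr)
				return -1;

			idle_cpu = __select_idle_cpu(cpu, p);
			if ((unsigned int)idle_cpu < nr_cpumask_bits)
				return idle_cpu;
		}

		/* drop the already-scanned cluster CPUs from the LLC scan */
		cpumask_andnot(cpus, cpus, sched_domain_span(sd));
	}

	return -1;
}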