From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id F1252EB64DD for ; Tue, 27 Jun 2023 15:19:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232329AbjF0PTV (ORCPT ); Tue, 27 Jun 2023 11:19:21 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41930 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232292AbjF0PTT (ORCPT ); Tue, 27 Jun 2023 11:19:19 -0400 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 36F3E26B3; Tue, 27 Jun 2023 08:19:17 -0700 (PDT) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id DF6492F4; Tue, 27 Jun 2023 08:20:00 -0700 (PDT) Received: from localhost (ionvoi01-desktop.cambridge.arm.com [10.2.78.69]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 67B873F663; Tue, 27 Jun 2023 08:19:16 -0700 (PDT) Date: Tue, 27 Jun 2023 16:19:14 +0100 From: Ionela Voinescu To: Ricardo Neri Cc: "Peter Zijlstra (Intel)" , Juri Lelli , Vincent Guittot , Ricardo Neri , "Ravi V. Shankar" , Ben Segall , Daniel Bristot de Oliveira , Dietmar Eggemann , Len Brown , Mel Gorman , "Rafael J. Wysocki" , Srinivas Pandruvada , Steven Rostedt , Tim Chen , Valentin Schneider , Lukasz Luba , Zhao Liu , "Yuan, Perry" , x86@kernel.org, "Joel Fernandes (Google)" , linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, "Tim C . Chen" , Zhao Liu Subject: Re: [PATCH v4 07/24] sched/fair: Compute IPC class scores for load balancing Message-ID: References: <20230613042422.5344-1-ricardo.neri-calderon@linux.intel.com> <20230613042422.5344-8-ricardo.neri-calderon@linux.intel.com> <20230625201155.GA3902@ranerica-svr.sc.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20230625201155.GA3902@ranerica-svr.sc.intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hey, On Sunday 25 Jun 2023 at 13:11:55 (-0700), Ricardo Neri wrote: > On Thu, Jun 22, 2023 at 10:02:44AM +0100, Ionela Voinescu wrote: > > On Monday 12 Jun 2023 at 21:24:05 (-0700), Ricardo Neri wrote: > > > When using IPCC scores to break ties between two scheduling groups, it is > > > necessary to consider both the current score and the score that would > > > result after load balancing. > > > > > > Compute the combined IPC class score of a scheduling group and the local > > > scheduling group. Compute both the current score and the prospective score. > > > > > > Collect IPCC statistics only for asym_packing and fully_busy scheduling > > > groups. These are the only cases that use IPCC scores. > > > > > > These IPCC statistics are used during idle load balancing. The candidate > > > scheduling group will have one fewer busy CPU after load balancing. This > > > observation is important for cores with SMT support. > > > > > > The IPCC score of scheduling groups composed of SMT siblings needs to > > > consider that the siblings share CPU resources. When computing the total > > > IPCC score of the scheduling group, divide the score of each sibling by > > > the number of busy siblings. > > > > > > Cc: Ben Segall > > > Cc: Daniel Bristot de Oliveira > > > Cc: Dietmar Eggemann > > > Cc: Ionela Voinescu > > > Cc: Joel Fernandes (Google) > > > Cc: Len Brown > > > Cc: Lukasz Luba > > > Cc: Mel Gorman > > > Cc: Perry Yuan > > > Cc: Rafael J. Wysocki > > > Cc: Srinivas Pandruvada > > > Cc: Steven Rostedt > > > Cc: Tim C. Chen > > > Cc: Valentin Schneider > > > Cc: Zhao Liu > > > Cc: x86@kernel.org > > > Cc: linux-pm@vger.kernel.org > > > Cc: linux-kernel@vger.kernel.org > > > Signed-off-by: Ricardo Neri > > > --- > > > Changes since v3: > > > * None > > > > > > Changes since v2: > > > * Also collect IPCC stats for fully_busy sched groups. > > > * Restrict use of IPCC stats to SD_ASYM_PACKING. (Ionela) > > > * Handle errors of arch_get_ipcc_score(). (Ionela) > > > > > > Changes since v1: > > > * Implemented cleanups and reworks from PeterZ. I took all his > > > suggestions, except the computation of the IPC score before and after > > > load balancing. We are computing not the average score, but the *total*. > > > * Check for the SD_SHARE_CPUCAPACITY to compute the throughput of the SMT > > > siblings of a physical core. > > > * Used the new interface names. > > > * Reworded commit message for clarity. > > > --- > > > kernel/sched/fair.c | 68 +++++++++++++++++++++++++++++++++++++++++++++ > > > 1 file changed, 68 insertions(+) > > > > > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > > > index c0cab5e501b6..a51c65c9335f 100644 > > > --- a/kernel/sched/fair.c > > > +++ b/kernel/sched/fair.c > > > @@ -9114,6 +9114,8 @@ struct sg_lb_stats { > > > unsigned long min_score; /* Min(score(rq->curr->ipcc)) */ > > > unsigned short min_ipcc; /* Class of the task with the minimum IPCC score in the rq */ > > > unsigned long sum_score; /* Sum(score(rq->curr->ipcc)) */ > > > + long ipcc_score_after; /* Prospective IPCC score after load balancing */ > > > + unsigned long ipcc_score_before; /* IPCC score before load balancing */ > > > #endif > > > }; > > > > > > @@ -9452,6 +9454,62 @@ static void update_sg_lb_ipcc_stats(int dst_cpu, struct sg_lb_stats *sgs, > > > } > > > } > > > > > > +static void update_sg_lb_stats_scores(struct sg_lb_stats *sgs, > > > + struct sched_group *sg, > > > + struct lb_env *env) > > > +{ > > > + unsigned long score_on_dst_cpu, before; > > > + int busy_cpus; > > > + long after; > > > + > > > + if (!sched_ipcc_enabled()) > > > + return; > > > + > > > + /* > > > + * IPCC scores are only useful during idle load balancing. For now, > > > + * only asym_packing uses IPCC scores. > > > + */ > > > + if (!(env->sd->flags & SD_ASYM_PACKING) || > > > + env->idle == CPU_NOT_IDLE) > > > + return; > > > + > > > + /* > > > + * IPCC scores are used to break ties only between these types of > > > + * groups. > > > + */ > > > + if (sgs->group_type != group_fully_busy && > > > + sgs->group_type != group_asym_packing) > > > + return; > > > + > > > + busy_cpus = sgs->group_weight - sgs->idle_cpus; > > > + > > > + /* No busy CPUs in the group. No tasks to move. */ > > > + if (!busy_cpus) > > > + return; > > > + > > > + score_on_dst_cpu = arch_get_ipcc_score(sgs->min_ipcc, env->dst_cpu); > > > + > > > + /* > > > + * Do not use IPC scores. sgs::ipcc_score_{after, before} will be zero > > > + * and not used. > > > + */ > > > + if (IS_ERR_VALUE(score_on_dst_cpu)) > > > + return; > > > + > > > + before = sgs->sum_score; > > > + after = before - sgs->min_score; > > > > I don't believe this can end up being negative as the sum of all > > scores should be higher or equal to the min score, right? > > Yes, I agree. `after` cannot be negative. > > > > > I'm just wondering if ipcc_score_after can be made unsigned long as well, > > just for consistency. > > Sure. I can make it of type unsigned long as well. > > > > > > + > > > + /* SMT siblings share throughput. */ > > > + if (busy_cpus > 1 && sg->flags & SD_SHARE_CPUCAPACITY) { > > > + before /= busy_cpus; > > > + /* One sibling will become idle after load balance. */ > > > + after /= busy_cpus - 1; > > > + } > > > + > > > + sgs->ipcc_score_after = after + score_on_dst_cpu; > > > + sgs->ipcc_score_before = before; > > > > Shouldn't the score_on_dst_cpu be added to "after" before being divided > > between the SMT siblings? > > No, because ipcc_score_after represents the joint score of the busiest > core and the destination core after load balance has taken place. The > destination core was previously idle and now contributes to the joint > score. > Right! score_on_dst_cpu does not contribute to the per-cpu throughput of the busiest core, but it reflects the improvement in score gained by the move to the destination. Thanks, Ionela. > Thanks and BR, > Ricardo