From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.11])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by smtp.subspace.kernel.org (Postfix) with ESMTPS id C59E23AE6FB
 for ; Wed, 29 Apr 2026 21:21:47 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.11
ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
 t=1777497709; cv=none;
 b=A9n0dio3aKcx4A16VUltCF5HGPT4NSNB5BAta7QAtTy8e0gMKyiT/RJfy1MX4lwrUpUMfl6ZcyLe45TrmGeBum7QWmYc9XbhhbKhQW5siUiIm/eZ7wPGaleROsHNYYzezNkrAl/VKpAz/7g4I9teaa/frd7L075yCy1pHNGC9/w=
ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org;
 s=arc-20240116; t=1777497709; c=relaxed/simple;
 bh=KOLfANa1cv8HB7c3O7D+HVteEa95bW0wXp59YXDxhBI=;
 h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References:
 In-Reply-To:To:Cc;
 b=IHjAGA3Abk5N5ThmvSMPHL+F5Mw+MIYNQNZ+KT4I8dB27AbDApkOgqVSF3AmObhGsOMVgnTSDbX4sxVM5IP1zQxGV4YSiqpsAiYM0WdJH4TKEppWJF1XiFsAhVmwDimTlVUPwY+la40gjfk204ycGlXygh2n3ckNoHMZtL+jIZs=
ARC-Authentication-Results:i=1; smtp.subspace.kernel.org;
 dmarc=pass (p=none dis=none) header.from=linux.intel.com;
 spf=pass smtp.mailfrom=linux.intel.com;
 dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=UVwus8xr;
 arc=none smtp.client-ip=198.175.65.11
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.intel.com
Authentication-Results: smtp.subspace.kernel.org;
 dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="UVwus8xr"
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple;
 d=intel.com; i=@intel.com; q=dns/txt; s=Intel;
 t=1777497708; x=1809033708;
 h=from:date:subject:mime-version:content-transfer-encoding:
 message-id:references:in-reply-to:to:cc;
 bh=KOLfANa1cv8HB7c3O7D+HVteEa95bW0wXp59YXDxhBI=;
 b=UVwus8xrZEQa9YKAQqC4y3JAIYKuUgAsBTr8bmtp2Yn4or8bedlFFsoF
 C6U8zS4lm/tA3VFsPDujSfFHDBD5r/rxa5OuMGOGBECGr0heqkt4CQLX3
 FDfcS0cbauIgSKvUXrs+nee6pWiq1WWBAdPVd6LXJ4xuUtcwdcUw/x2ay
 m0KpzpOw60ET2A7+oqaaZ9NFl/PlJM2M2EMCN5aBiY1e5Ja13hVpYTCsd
 p/vq+b6+8jAuLI/MAacZnaqqCmxIR83DzAUhoMw/eAYFZOQXFK3jb4tjn
 RSsM8RGringnZlt8ebVJy7OQc260lE+4EDk4DBczF5xXFZkKAwWzBm/jP
 g==;
X-CSE-ConnectionGUID: tbqij4K/ToG/y0deNuZDmg==
X-CSE-MsgGUID: NsevcDZ6TYyUuaHd6FwoDA==
X-IronPort-AV: E=McAfee;i="6800,10657,11771"; a="88748734"
X-IronPort-AV: E=Sophos;i="6.23,206,1770624000"; d="scan'208";a="88748734"
Received: from orviesa008.jf.intel.com ([10.64.159.148])
 by orvoesa103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Apr 2026 14:21:44 -0700
X-CSE-ConnectionGUID: p9N1QvrLS/+N25WG6bsXzg==
X-CSE-MsgGUID: 5SlK89CpQ7iStyDDrIxMIA==
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="6.23,206,1770624000"; d="scan'208";a="234260025"
Received: from unknown (HELO [172.25.112.21]) ([172.25.112.21])
 by orviesa008.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Apr 2026 14:21:44 -0700
From: Ricardo Neri
Date: Wed, 29 Apr 2026 14:19:44 -0700
Subject: [PATCH v2 1/4] sched/fair: Check CPU capacity before comparing
 group types during load balance
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
Message-Id: <20260429-rneri-fix-cas-clusters-v2-1-cd787de35cc6@linux.intel.com>
References: <20260429-rneri-fix-cas-clusters-v2-0-cd787de35cc6@linux.intel.com>
In-Reply-To: <20260429-rneri-fix-cas-clusters-v2-0-cd787de35cc6@linux.intel.com>
To: Ingo Molnar , Peter Zijlstra , Juri Lelli , Vincent Guittot ,
 Dietmar Eggemann , Steven Rostedt , Ben Segall , Mel Gorman ,
 Valentin Schneider , Tim C Chen , Chen Yu , Christian Loehle ,
 Barry Song
Cc: "Rafael J.
 Wysocki" , Len Brown , ricardo.neri@intel.com,
 linux-kernel@vger.kernel.org, Ricardo Neri
X-Mailer: b4 0.13.0
X-Developer-Signature: v=1; a=ed25519-sha256; t=1777497633; l=2602;
 i=ricardo.neri-calderon@linux.intel.com; s=20250602;
 h=from:subject:message-id; bh=KOLfANa1cv8HB7c3O7D+HVteEa95bW0wXp59YXDxhBI=;
 b=oOci730FgZeICJvOGAwT/2o23TImg5FhO0WoTAVM6NQmioiM0RBABJ4sj3CQV91XsqzlbS1lt
 jRc94ajq5T9Dv19POsjCC5+mLeVsbk8TGa6cRdAE0iFIXgDCfgeNVYP
X-Developer-Key: i=ricardo.neri-calderon@linux.intel.com; a=ed25519;
 pk=NfZw5SyQ2lxVfmNMaMR6KUj3+0OhcwDPyRzFDH9gY2w=

update_sd_pick_busiest() may incorrectly select a fully_busy group as the
busiest group when its per-CPU capacity exceeds that of the destination
CPU. This happens because the type of busiest group is initialized to
group_has_spare and allows the fully_busy group to win the type comparison.

update_sd_pick_busiest() should not choose a candidate scheduling group
with at most one runnable task if its per-CPU capacity is greater than that
of the destination CPU. Such a check already exists, but it is done too
late: after the type comparison, preventing a subsequent fully_busy group
of equal per-CPU capacity from being correctly selected.

Move this check to occur before comparing group types.

Signed-off-by: Ricardo Neri
---
Changes since v1:
 * Added a note clarifying that SMT and SD_ASYM_CPUCAPACITY are mutually
   exclusive. (Tim)
 * Kept parentheses around bitwise operators for clarity.
 * Rewrote patch description for clarity.
---
 kernel/sched/fair.c | 25 ++++++++++++++-----------
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 728965851842..0dbed82aa63f 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -10788,6 +10788,20 @@ static bool update_sd_pick_busiest(struct lb_env *env,
 	     sds->local_stat.group_type != group_has_spare))
 		return false;
 
+	/*
+	 * Candidate sg has no more than one task per CPU and has higher
+	 * per-CPU capacity. Migrating tasks to less capable CPUs may harm
+	 * throughput. Maximize throughput, power/energy consequences are not
+	 * considered.
+	 *
+	 * Systems with SMT are unaffected, as asymmetric capacity is not set
+	 * in such case.
+	 */
+	if ((env->sd->flags & SD_ASYM_CPUCAPACITY) &&
+	    (sgs->group_type <= group_fully_busy) &&
+	    (capacity_greater(sg->sgc->min_capacity, capacity_of(env->dst_cpu))))
+		return false;
+
 	if (sgs->group_type > busiest->group_type)
 		return true;
 
@@ -10890,17 +10904,6 @@ static bool update_sd_pick_busiest(struct lb_env *env,
 		break;
 	}
 
-	/*
-	 * Candidate sg has no more than one task per CPU and has higher
-	 * per-CPU capacity. Migrating tasks to less capable CPUs may harm
-	 * throughput. Maximize throughput, power/energy consequences are not
-	 * considered.
-	 */
-	if ((env->sd->flags & SD_ASYM_CPUCAPACITY) &&
-	    (sgs->group_type <= group_fully_busy) &&
-	    (capacity_greater(sg->sgc->min_capacity, capacity_of(env->dst_cpu))))
-		return false;
-
 	return true;
 }
 
-- 
2.43.0
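
Illustration only, not part of the patch: the ordering problem described in
the changelog can be reproduced with a small userspace model. Everything
below is a simplified sketch under stated assumptions. struct group_model,
too_capable(), pick() and find_busiest() are hypothetical names that do not
exist in kernel/sched/fair.c, the model assumes SD_ASYM_CPUCAPACITY is set,
and it reduces the per-type tie-breaking to a plain avg_load comparison.
capacity_greater() is written as a function that mimics the roughly 5%
margin of the kernel macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Subset of the kernel's group_type ordering; lower value means less busy. */
enum group_type { group_has_spare = 0, group_fully_busy, group_overloaded };

/* Hypothetical, reduced stand-in for the per-group statistics. */
struct group_model {
	enum group_type type;
	unsigned long min_capacity;	/* lowest per-CPU capacity in the group */
	unsigned long avg_load;
};

/* Assumed to mirror the kernel macro: cap1 must exceed cap2 by about 5%. */
static bool capacity_greater(unsigned long cap1, unsigned long cap2)
{
	return cap1 * 1024 > cap2 * 1078;
}

/* The capacity check that the patch moves (SD_ASYM_CPUCAPACITY assumed set). */
static bool too_capable(const struct group_model *sg, unsigned long dst_cap)
{
	return sg->type <= group_fully_busy &&
	       capacity_greater(sg->min_capacity, dst_cap);
}

/* Returns true if sg should replace the current busiest candidate. */
static bool pick(const struct group_model *sg,
		 const struct group_model *busiest,
		 unsigned long dst_cap, bool check_first)
{
	if (check_first && too_capable(sg, dst_cap))
		return false;		/* patched order: reject early */

	if (sg->type > busiest->type)
		return true;		/* a busier type wins immediately */
	if (sg->type < busiest->type)
		return false;

	if (sg->avg_load < busiest->avg_load)
		return false;		/* simplified same-type tie-break */

	if (!check_first && too_capable(sg, dst_cap))
		return false;		/* original order: reject late */

	return true;
}

/* Walks the groups as the load balancer would; returns an index or -1. */
static int find_busiest(const struct group_model *groups, size_t n,
			unsigned long dst_cap, bool check_first)
{
	/* As in the changelog, busiest starts out as group_has_spare. */
	struct group_model busiest = { .type = group_has_spare };
	int idx = -1;

	assert(groups != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (pick(&groups[i], &busiest, dst_cap, check_first)) {
			busiest = groups[i];
			idx = (int)i;
		}
	}
	return idx;
}
```

In this model, take a destination CPU of capacity 512 and two fully_busy
groups: one with min_capacity 1024 and avg_load 900, followed by one with
min_capacity 512 and avg_load 500. With check_first == false the
higher-capacity group wins the type comparison against the initial
group_has_spare and then shadows the second group, so find_busiest()
returns index 0. With check_first == true the first group is rejected up
front and index 1 is returned.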