From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH v3 4/4] sched/topology: Do not clear SD_PREFER_SIBLING in domains with clusters
From: Tim Chen
To: Ricardo Neri, Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, Chen Yu, Christian Loehle, Barry Song
Cc: "Rafael J. Wysocki", Len Brown, ricardo.neri@intel.com,
	linux-kernel@vger.kernel.org
Date: Fri, 15 May 2026 13:21:09 -0700
In-Reply-To: <20260514-rneri-fix-cas-clusters-v3-4-0037869554bd@linux.intel.com>
References: <20260514-rneri-fix-cas-clusters-v3-0-0037869554bd@linux.intel.com>
 <20260514-rneri-fix-cas-clusters-v3-4-0037869554bd@linux.intel.com>
User-Agent: Evolution 3.58.1 (3.58.1-1.fc43)
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0

On Thu, 2026-05-14 at 11:34 -0700, Ricardo Neri wrote:
> Some topologies have scheduling domains that contain CPUs of asymmetric
> capacity, grouped into two or more clusters of equal-capacity CPUs
> sharing an L2 cache. When CONFIG_SCHED_CLUSTER is enabled, load must be
> balanced across these resource-sharing clusters.
>
> Do not clear SD_PREFER_SIBLING in the child domains to indicate to the
> load balancer that it should spread load among cluster siblings.
>
> Checks for capacity in update_sd_pick_busiest() prevent migrations from
> high- to low-capacity CPUs if a candidate group is not overloaded.
>
> An effect of keeping the SD_PREFER_SIBLING in domains with asymmetric
> capacity is that low-capacity clusters with spare capacity can now help
> overloaded higher-capacity groups. This was already the case for single-CPU
> groups (see calculate_imbalance() for domains with SD_SHARE_LLC).
>
> Once the overloading condition disappears, misfit load will still be used
> to move high-utilization tasks to bigger CPUs if they have spare capacity.

Looks good to me.

Reviewed-by: Tim Chen

>
> Signed-off-by: Ricardo Neri
> ---
> Changes in v3:
> * Updated documentation of SD_PREFER_SIBLING.
> * Expanded the patch description to explain the behavior when overloaded
>   groups are involved.
>
> Changes in v2:
> * Reworded the patch description for clarity.
> * Kept parentheses around bitwise operators for clarity.
> ---
>  include/linux/sched/sd_flags.h |  3 ++-
>  kernel/sched/topology.c        | 14 ++++++++++++--
>  2 files changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/sched/sd_flags.h b/include/linux/sched/sd_flags.h
> index 42839cfa2778..42f74af83b8c 100644
> --- a/include/linux/sched/sd_flags.h
> +++ b/include/linux/sched/sd_flags.h
> @@ -147,7 +147,8 @@ SD_FLAG(SD_ASYM_PACKING, SDF_NEEDS_GROUPS)
>   * Prefer to place tasks in a sibling domain
>   *
>   * Set up until domains start spanning NUMA nodes. Close to being a SHARED_CHILD
> - * flag, but cleared below domains with SD_ASYM_CPUCAPACITY.
> + * flag, but cleared below domains with SD_ASYM_CPUCAPACITY if the domain does
> + * not have clusters of CPUs sharing cache.
>   *
>   * NEEDS_GROUPS: Load balancing flag.
>   */
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index 5847b83d9d55..a1d048344ea1 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -1723,8 +1723,18 @@ sd_init(struct sched_domain_topology_level *tl,
>  	/*
>  	 * Convert topological properties into behaviour.
>  	 */
> -	/* Don't attempt to spread across CPUs of different capacities. */
> -	if ((sd->flags & SD_ASYM_CPUCAPACITY) && sd->child)
> +	/*
> +	 * Don't attempt to spread across CPUs of different capacities.
> +	 *
> +	 * If the domain has clusters of CPUs sharing L2 cache, keep the flag to
> +	 * spread tasks across clusters of identical capacity. Checks in
> +	 * update_sd_pick_busiest() prevent task migrations from high- to low-
> +	 * capacity CPUs for non-overloaded groups. Migrations to a lower-
> +	 * capacity CPU can happen if a higher-capacity group is overloaded and
> +	 * a low-capacity cluster has spare capacity.
> +	 */
> +	if ((sd->flags & SD_ASYM_CPUCAPACITY) && sd->child &&
> +	    !(sd->child->flags & SD_CLUSTER))
>  		sd->child->flags &= ~SD_PREFER_SIBLING;
>
>  	if (sd->flags & SD_SHARE_CPUCAPACITY) {