From: Chen Yu
To: kprateek.nayak@amd.com, tim.c.chen@linux.intel.com, peterz@infradead.org
Cc: pan.deng@intel.com, mingo@kernel.org, linux-kernel@vger.kernel.org, tianyou.li@intel.com
Subject: Re: [PATCH v2 1/4] sched/rt: Optimize cpupri_vec layout to mitigate cache line contention
Date: Sun, 10 May 2026 23:59:16 +0800
Message-Id: <20260510155920.2587431-1-yu.c.chen@intel.com>
In-Reply-To: <729726b9-c669-41e2-887d-bdf9da703034@amd.com>
References: <729726b9-c669-41e2-887d-bdf9da703034@amd.com>

On Fri, Apr 10, 2026 at 11:32:09AM +0530, K Prateek Nayak wrote:
> Hello Chenyu, Tim,
>
> On 4/10/2026 11:21 AM, Chen, Yu C wrote:
> >>> I think a per-LLC mask (or, as Tim suggested, 64 CPUs per cacheline) is
> >>> a good tradeoff between the speedup and the number of loads required to
> >>> piece together the full cpumask. Thoughts?
> >
> > Yes, making it per-LLC should work well enough (for balancing) to
> > achieve optimal benefit. Let me run some tests similar to yours, plus
> > hackbench/schbench, to see what the results are.
> > BTW, on AMD systems, does the TILE domain always match the CCX where
> > the L3 is shared? On Intel, the DIE is not always mapped to a domain
> > where the L3 is shared.
>
> On AMD platforms that support the extended leaf 0x80000026, the CCX is
> always mapped to the L3 and matches the data in the 0x8000001D
> cache-property leaf for the L3.
>
> > >> I agree that a per-LLC mask is a good compromise between minimizing
> > >> loads and offering good speedups. I think we should get the LLC APIC ID
> > >> mask from the 0x4 leaf (L1, L2, L3) instead of inferring it from the
> > >> 0x1f leaf (Tile, Die, etc.) on Intel. For AMD, I think the cache leaf
> > >> is 0x8000001D. Those are parsed in the cacheinfo code, and we can get
> > >> the information from there.
> >
> > Yes, let me check how we can leverage the L3 id for that.
>
> Ack! I think cacheinfo is better for all this and is also compatible
> with older systems that may not have the extended topology enumeration
> leaf. AMD only got it two generations ago, and until then only the
> cache-property leaf was used for marking the LLC (CCX) boundary.

Sorry for the delay. Here are the changes that create sbm leafs based on
cacheinfo. They can be applied on top of Peter's original patches and
Prateek's search optimization. We have not tested them yet; the goal is
to provide an evaluation prototype that prepares for the next steps:
nohz idle mask evaluation, converting cpupri_vec->cpumask to per-LLC
granularity, etc. We will start testing the nohz mask (if there are no
objections to this prototype) and share the results later.

thanks,
Chenyu