From: Chen Yu
To: kprateek.nayak@amd.com, tim.c.chen@linux.intel.com, peterz@infradead.org
Cc: pan.deng@intel.com, mingo@kernel.org, linux-kernel@vger.kernel.org, tianyou.li@intel.com
Subject: Re: [PATCH v2 1/4] sched/rt: Optimize cpupri_vec layout to mitigate cache line contention
Date: Sun, 10 May 2026 23:59:16 +0800
Message-Id: <20260510155920.2587431-1-yu.c.chen@intel.com>
In-Reply-To: <729726b9-c669-41e2-887d-bdf9da703034@amd.com>
References: <729726b9-c669-41e2-887d-bdf9da703034@amd.com>

On Fri, Apr 10, 2026 at 11:32:09AM +0530, K Prateek Nayak wrote:
> Hello Chenyu, Tim,
>
> On 4/10/2026 11:21 AM, Chen, Yu C wrote:
> >>> I think a per-LLC mask (or, as Tim suggested, 64 CPUs per cacheline) is
> >>> a good tradeoff between the speedup and the number of loads required to
> >>> piece together the full cpumask. Thoughts?
> >
> > Yes, making it per-LLC should work well enough (for balancing) to
> > achieve optimal benefit. Let me run some tests similar to yours, plus
> > hackbench/schbench, to see what the results are.
> > BTW, on AMD systems, does the TILE domain always match the CCX where
> > the L3 is shared? On Intel, the DIE is not always mapped to a domain
> > where the L3 is shared.
>
> On AMD platforms that support the extended leaf 0x80000026, the CCX is
> always mapped to the L3 and matches the data in the 0x8000001D
> cache-property leaf for the L3.
>
> > >> I agree that a per-LLC mask is a good compromise between minimizing
> > >> loads and offering good speedups. I think we should get the LLC APIC ID
> > >> mask from the 0x4 leaf (L1, L2, L3) instead of inferring it from the
> > >> 0x1f leaf (Tile, Die, etc.) on Intel. For AMD, I think the cache leaf
> > >> is 0x8000001D. Those are parsed in the cacheinfo code, and we can get
> > >> the information from there.
> >
> > Yes, let me check how we can leverage the L3 id for that.
>
> Ack! I think cacheinfo is better for all this and is also compatible
> with older systems that may not have the extended topology enumeration
> leaf. AMD only got it two generations ago, and until then only the
> cache-property leaf was used for marking the LLC (CCX) boundary.

Sorry for the delay. Here are the changes that create sbm leafs based on
cacheinfo. They can be applied on top of Peter's original patches and
Prateek's search optimization. We have not tested them yet; the goal is
to provide an evaluation prototype that prepares for the next steps:
nohz idle mask evaluation, converting cpupri_vec->cpumask to per-LLC
granularity, etc. We will start testing the nohz mask (if there are no
objections to this prototype) and share the results later.

thanks,
Chenyu