Message-ID: <19b2e22bd44d9f10a4960d5f1c4609e78fee73ba.camel@linux.intel.com>
Subject: Re: [PATCH v2 4/4] sched/rt: Split cpupri_vec->cpumask to per NUMA node to reduce contention
From: Tim Chen
To: "Chen, Yu C"
Cc: Pan Deng, mingo@kernel.org, linux-kernel@vger.kernel.org, tianyou.li@intel.com, K Prateek Nayak, Peter Zijlstra
Date: Wed, 08 Apr 2026 09:47:18 -0700
In-Reply-To: <3a146435-7f5a-40e1-9e63-b9bb7494faf1@intel.com>
References: <20260320124003.GU3738786@noisy.programming.kicks-ass.net> <63a095f02428700a7ff2623b8ea81e524a406834.camel@linux.intel.com> <20260324120008.GB3738010@noisy.programming.kicks-ass.net> <138c3f9d-309f-41e6-aa72-a3f6bd713bf0@intel.com> <22072ef8-5aec-49ac-9cc4-8a80bec14261@amd.com> <64649c85-29ab-4f70-a0c4-3c83cbdae2fc@intel.com> <20260402105530.GA3738786@noisy.programming.kicks-ass.net> <93d7eb33-c3a5-4498-bc26-57806b73d9e0@amd.com> <3b66e8e8-07e0-4f3e-a3ba-d97133af5162@intel.com>
 <1c742a1d8ecd8e314d704d46a44e2b8893479e50.camel@linux.intel.com> <3a146435-7f5a-40e1-9e63-b9bb7494faf1@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 2026-04-08 at 17:25 +0800, Chen, Yu C wrote:
> On 4/8/2026 4:35 AM, Tim Chen wrote:
> > On Fri, 2026-04-03 at 13:46 +0800, Chen, Yu C wrote:
> > > On 4/2/2026 7:06 PM, K Prateek Nayak wrote:
> > > > Hello Peter,
> > > >
> > > > On 4/2/2026 4:25 PM, Peter Zijlstra wrote:
> > > > > On Thu, Apr 02, 2026 at 10:11:11AM +0530, K Prateek Nayak wrote:
> > > > >
> > > > > > It is still not super clear to me how the logic deals with more than
> > > > > > 128 CPUs in a DIE domain, because that will need more than the u64, but
> > > > > > sbm_find_next_bit() simply does:
> > > > > >
> > > > > > 	tmp = leaf->bitmap & mask; /* All are u64 */
> > > > > >
> > > > > > expecting just the u64 bitmap to represent all the CPUs in the leaf.
> > > > > >
> > > > > > If we have, say, 256 CPUs per DIE, we get shift(7) and arch_sbm_mask
> > > > > > as 7f (127), which allows a leaf to cover more than 64 CPUs, but we are
> > > > > > using the "u64 bitmap" directly and not:
> > > > > >
> > > > > > 	find_next_bit(bitmap, arch_sbm_mask)
> > > > > >
> > > > > > Am I missing something here?
> > > > >
> > > > > Nope. That logic just isn't there; that was left as an exercise to the
> > > > > reader :-)
> > > >
> > > > Ack! Let me go fiddle with that.
> > > >
> > >
> > > Nice catch. I hadn't noticed this since we have fewer than
> > > 64 CPUs per die. Please feel free to send patches to me when
> > > they're available.
> > >
> > > And regarding your other question about the calculation of arch_sbm_shift,
> > > I'm trying to understand why there is a subtraction of 1. Should it be:
> > > -	arch_sbm_shift = x86_topo_system.dom_shifts[TOPO_DIE_DOMAIN] - 1;
> > > +	arch_sbm_shift = x86_topo_system.dom_shifts[TOPO_DIE_DOMAIN - 1];
> >
> > Perhaps something like
> >
> > 	arch_sbm_shift = min(sizeof(unsigned long),
> > 			     topology_get_domain_shift(TOPO_TILE_DOMAIN));
> >
> > to take care of both AMD systems and the 64-bit leaf bitmask limit?
> >
>
> Yes, this should be doable (Prateek has mentioned using TOPO_TILE_DOMAIN).
> The only drawback I can think of is that if there are more than 64 CPUs
> within a die, it is possible that CPUs in different dies (LLCs) would be
> indexed in the same leaf and access the same mask,

First, I think I should have used

	arch_sbm_shift = min(BITS_PER_LONG,
			     topology_get_domain_shift(TOPO_TILE_DOMAIN));

I am assuming that we should choose TOPO_DIE_DOMAIN for Intel CPUs and
TOPO_TILE_DOMAIN for AMD CPUs, and that such a domain choice will span
one L3 (I think that's the case). Then leaf domains smaller than the
domain size will also only span one L3, by definition.

So for the 128-CPU example you gave, both leaves, with CPUs 0-63 and
64-127, will span the same LLC and we should not have cache bounce.

Tim

> which would still lead to cache
> contention. Maybe we should allocate the leaf cpumask according to the
> actual size of a die?
>
> thanks,
> Chenyu
>
>