Message-ID: <641cb6457fcbb6509351ecb45bdff300540f9f55.camel@linux.intel.com>
Subject: Re: [RFC][PATCH 5/6] x86/topo: Fix SNC topology mess
From: Tim Chen
To: "Chen, Yu C", Peter Zijlstra
Cc: linux-kernel@vger.kernel.org, kyle.meyer@hpe.com, vinicius.gomes@intel.com,
	brgerst@gmail.com, hpa@zytor.com, kprateek.nayak@amd.com,
	patryk.wlazlyn@linux.intel.com, rafael.j.wysocki@intel.com,
	russ.anderson@hpe.com, zhao1.liu@intel.com, tony.luck@intel.com,
	x86@kernel.org, tglx@kernel.org
Date: Thu, 26 Feb 2026 14:11:49 -0800
References: <20260226104909.675623579@infradead.org>
	<20260226105052.737712686@infradead.org>
	<334d4edb-b8f4-41fb-aa16-6cb7abeaa21d@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 2026-02-26 at 11:00 -0800, Tim Chen wrote:
> On Fri, 2026-02-27 at 01:07 +0800, Chen, Yu C wrote:
> > Hi Peter,
> >
> > On 2/26/2026 6:49 PM, Peter Zijlstra wrote:
> > > +	int u = __num_nodes_per_package;
> >
> > Yes, this is much simpler, thanks for the patch!
> >
> > > +	long d = 0;
> > > +	int x, y;
> > > +
> > > +	/*
> > > +	 * Is this a unit cluster on the trace?
> > > +	 */
> > > +	if ((i / u) == (j / u))
> > > +		return node_distance(i, j);
> >
> > If the number of nodes per package is 3, we assume that
> > every 3 consecutive nodes are SNC siblings (on the same
> > trace):node0, node1, and node2 are SNC siblings, while
> > node3, node4, and node5 form another group of SNC siblings.
> >
> > I have a curious thought: could it be possible that
> > node0, node2, and node4 are SNC siblings, and node1,
> > node3, and node5 are another set of SNC siblings instead?
> >
> > Then I studied the code a little more, node ids are dynamically
> > allocated via the acpi_map_pxm_to_node, so the assignment of node
> > ids depends on the order in which each processor affinity structure
> > is listed in the SRAT table. For example, suppose CPU0 belongs to
> > package0 and CPU1 belongs to package1, but their entries are placed
> > consecutively in the SRAT. In this case, the Proximity Domain of
> > CPU0 would be mapped to node0 via acpi_map_pxm_to_node, and CPU1’s
> > Proximity Domain would be assigned node1. The logic above would
> > then treat them as belonging to the same package, even though they
> > are physically in different packages. However, I believe such a
> > scenario is unlikely to occur in practice in the BIOS and if it
> > happens it should be a BIOS bug if I understand correctly.
> >
> >
>
> May be a good idea to sanity check that the nodes in the first unit cluster
> has the same package id and give a WARNING if that's not the case.
>

Perhaps something like below.

Tim

---
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index d97f8f4e014c..38384ea5253a 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -511,6 +511,24 @@ static int slit_cluster_distance(int i, int j)
 	int u = __num_nodes_per_package;
 	long d = 0;
 	int x, y;
+	static int valid_slit = 0;
+
+	if (valid_slit == -1)
+		return node_distance(i, j);
+
+	if (valid_slit == 0) {
+		/* Check first nodes in package are grouped together consecutively */
+		for (x = 0; x < u-1 ; x++) {
+			if (topology_physical_package_id(x) !=
+			    topology_physical_package_id(x+1)) {
+				pr_warn("Expect nodes %d and %d to be in the same package\n",
+					x, x+1);
+				valid_slit = -1;
+				return node_distance(i, j);
+			}
+		}
+		valid_slit = 1;
+	}
 
 	/*
 	 * Is this a unit cluster on the trace?
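
For anyone who wants to experiment with the grouping assumption outside the kernel, here is a minimal user-space sketch. It is not kernel code and not part of the patch. node_pkg[] is a hypothetical table giving the package id of each node id, and both helper names are made up for illustration. Unlike the check in the patch, which looks only at the first unit cluster, this sketch walks every group of u consecutive node ids.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: mirrors the (i / u) == (j / u) test quoted above.
 * True when node ids i and j fall in the same run of u consecutive ids.
 */
static bool same_unit_cluster(int i, int j, int u)
{
	return (i / u) == (j / u);
}

/*
 * Hypothetical check: true when every run of u consecutive node ids
 * maps to a single package. node_pkg[n] is the assumed package id of
 * node n; in the kernel this would have to come from topology data.
 */
static bool nodes_grouped_by_package(const int *node_pkg, size_t nr_nodes, int u)
{
	size_t n;

	if (u <= 0)
		return false;

	for (n = 0; n < nr_nodes; n++) {
		/* First node id of the run that n belongs to. */
		size_t first = n - (n % (size_t)u);

		if (node_pkg[n] != node_pkg[first])
			return false;
	}
	return true;
}
```

With u = 3, a table such as {0, 0, 0, 1, 1, 1} passes, while the interleaved layout from Chen Yu's question ({0, 1, 0, 1, 0, 1}) is rejected, which is the case the warning is meant to catch.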