Message-ID: <2ca29d50-88c4-4f4a-afc6-4b79700004e3@intel.com>
Date: Thu, 13 Nov 2025 11:32:34 +0800
X-Mailing-List: virtualization@lists.linux.dev
Subject: Re: [PATCH RESEND] lib/group_cpus: make group CPU cluster aware
To: Ming Lei
Cc: Andrew Morton, Thomas Gleixner, Keith Busch, Jens Axboe,
 Christoph Hellwig, Sagi Grimberg, linux-kernel@vger.kernel.org,
 linux-nvme@lists.infradead.org,
 virtualization@lists.linux-foundation.org, linux-block@vger.kernel.org,
 Tianyou Li, Tim Chen, Dan Liang
References: <20251111020608.1501543-1-wangyang.guo@intel.com>
From: "Guo, Wangyang"

On 11/13/2025 9:38 AM, Ming Lei wrote:
> On Wed, Nov 12, 2025 at 11:02:47AM +0800, Guo, Wangyang wrote:
>> On 11/11/2025 8:08 PM, Ming Lei wrote:
>>> On Tue, Nov 11, 2025 at 01:31:04PM +0800, Guo, Wangyang wrote:
>>>> On 11/11/2025 11:25 AM, Ming Lei wrote:
>>>>> On Tue, Nov 11, 2025 at 10:06:08AM +0800, Wangyang Guo wrote:
>>>>>> As CPU core counts increase, the number of NVMe IRQs may be smaller than
>>>>>> the total number of CPUs. This forces multiple CPUs to share the same
>>>>>> IRQ. If the IRQ affinity and the CPU’s cluster do not align, a
>>>>>> performance penalty can be observed on some platforms.
>>>>>
>>>>> Can you add details why/how CPU cluster isn't aligned with IRQ
>>>>> affinity? And how performance penalty is caused?
>>>>
>>>> Intel Xeon E platform packs 4 CPU cores as 1 module (cluster) and share the
>>>> L2 cache. Let's say, if there are 40 CPUs in 1 NUMA domain and 11 IRQs to
>>>> dispatch. The existing algorithm will map first 7 IRQs each with 4 CPUs and
>>>> remained 4 IRQs each with 3 CPUs each. The last 4 IRQs may have cross
>>>> cluster issue. For example, the 9th IRQ which pinned to CPU32, then for
>>>> CPU31, it will have cross L2 memory access.
>>>
>>>
>>> CPUs sharing L2 usually have small number, and it is common to see one queue
>>> mapping includes CPUs from different L2.
>>>
>>> So how much does crossing L2 hurt IO perf?
>> We see 15%+ performance difference in FIO libaio/randread/bs=8k.
>
> As I mentioned, it is common to see CPUs crossing L2 in same group, but why
> does it make a difference here? You mentioned just some platforms are
> affected.

We observed the performance difference on the Intel Xeon E platform, which
packs 4 physical CPU cores into 1 module (cluster) sharing the same L2 cache.

For other platforms such as Intel P-core or AMD, I think it is hard to show a
performance benefit from L2 locality, because:
1. The L2 cache is shared only between the 2 logical cores of one physical
   core when HT is enabled.
2. If the IRQ is pinned to the corresponding HT sibling, L2 cache locality is
   good, but other aspects such as retiring may be affected, since both
   logical cores share the same physical CPU resources.

BR
Wangyang
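
For illustration, the 40-CPU / 11-IRQ example quoted above can be checked with
a short sketch. This is not the code in lib/group_cpus.c; it only reproduces
the arithmetic described in the thread. It assumes contiguous CPU numbering
within the NUMA domain, that each cluster is 4 consecutive CPU IDs, and that
the remainder CPUs go to the first groups. The function names are
hypothetical.

```python
def split_contiguous(ncpus, ngroups):
    """Split CPU IDs 0..ncpus-1 into ngroups contiguous groups.

    Assumption: the first (ncpus % ngroups) groups get one extra CPU,
    matching the "7 IRQs with 4 CPUs, 4 IRQs with 3 CPUs" description.
    """
    base, extra = divmod(ncpus, ngroups)
    groups = []
    start = 0
    for g in range(ngroups):
        size = base + (1 if g < extra else 0)
        groups.append(list(range(start, start + size)))
        start += size
    return groups


def clusters_spanned(group, cluster_size):
    """Return the sorted cluster indices touched by one group of CPUs.

    Assumption: cluster index is simply cpu_id // cluster_size.
    """
    return sorted({cpu // cluster_size for cpu in group})


groups = split_contiguous(40, 11)

# IRQ numbers are 1-based here to match the wording in the thread.
crossing = [
    (irq, cpus)
    for irq, cpus in enumerate(groups, start=1)
    if len(clusters_spanned(cpus, 4)) > 1
]

for irq, cpus in crossing:
    print(f"IRQ {irq}: CPUs {cpus} span clusters {clusters_spanned(cpus, 4)}")
```

Under these assumptions the 9th group covers CPUs 31-33, so CPU31 sits in a
different 4-core cluster than CPU32 and CPU33, which is the cross-L2 case
described in the quoted text.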