From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.15]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 278101A256E for ; Fri, 7 Nov 2025 23:32:19 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.15 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762558342; cv=none; b=XG2aHV6RWelvFGuBBGpsCZWWCMX2quRyjKX2/7YxTutf/LMEdB4LQbBkVRps8VnydCr7OSpCy9L4tRDsopMrXMV26o3e55YCz6kfpIWmCoLi1NGybWVmyxvMpoSagljZYfUBwFpGduTh/812iTvTYpvQjTS44WQxdEMITUT5FGg= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762558342; c=relaxed/simple; bh=lg95PqZ6UPEcke6CzDZMCmLXRUpVCM7TugkfOV8gbAQ=; h=From:Subject:Date:Message-Id:MIME-Version:Content-Type:To:Cc; b=r2vVn6w8GUE/Hkzhx/RnQq8S+fDFEl/bfRv5JeTOd05tAxyQ7c169LoWzo2t58/9tAj5CLr61kN1QN6jAbbLjoPidMRFf6Q6BsmmmvZWHo784uB7rITUxlcy038rxBSNH3bGz165fb2S+RJGtbCXMOke6FFNC4N2Sl6OqnsrMB4= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=YDgGTZs1; arc=none smtp.client-ip=192.198.163.15 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="YDgGTZs1" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1762558341; x=1794094341; h=from:subject:date:message-id:mime-version: content-transfer-encoding:to:cc; bh=lg95PqZ6UPEcke6CzDZMCmLXRUpVCM7TugkfOV8gbAQ=; b=YDgGTZs1hMMGK/Yxj6AqVSADgWU3dg5baB9B1jkU/tDLENXbCQ4Pv1kV XzU3jnB9M6DPWxLhXWgpO9SnROmX/zLOAGQ7b8W0pt745znyx+YAs2LnP F7jBqFuZX1VrJaxqj3OQND7/V27fTxWGTJA1Z3wsjaXqQEb9B862ROfVx R0OSIP5OlCvFbu6vKodQSX2f0eYXyuD43ZBMMWyYMr4+cfh0qmewOk0cs 5kqtbFQ02uo5cEz8ldgxS8Ft0a01xBdLQYx1qUGdlFVMaFomWur9XriXX yudr3ODiwbee979KyaTrjLamV/5XhcliU1o5VEoDUAIkcaWebWd11D9QY w==; X-CSE-ConnectionGUID: XKtT+2T0Ste9Hq0cZIcOGw== X-CSE-MsgGUID: yi8OIXM1SSKBO45yzql3YA== X-IronPort-AV: E=McAfee;i="6800,10657,11606"; a="64806319" X-IronPort-AV: E=Sophos;i="6.19,288,1754982000"; d="scan'208";a="64806319" Received: from orviesa002.jf.intel.com ([10.64.159.142]) by fmvoesa109.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 07 Nov 2025 15:32:19 -0800 X-CSE-ConnectionGUID: 1PAhrkUDTT+tq+R4DQGeUg== X-CSE-MsgGUID: w2gNIv08QSOMKAsh2osHjg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.19,288,1754982000"; d="scan'208";a="218815413" Received: from orcnseosdtjek.jf.intel.com (HELO [10.166.28.90]) ([10.166.28.90]) by orviesa002-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 07 Nov 2025 15:32:19 -0800 From: Jacob Keller Subject: [PATCH iwl-next v3 0/9] ice: properly use u64_stats API for all ring stats Date: Fri, 07 Nov 2025 15:31:44 -0800 Message-Id: <20251107-jk-refactor-queue-stats-v3-0-771ae1414b2e@intel.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit X-B4-Tracking: v=1; b=H4sIAGKBDmkC/3XNTQ6CMBAF4KuYrh3TTmkBV97DuCh00PoDSgtiD He36UZjwvLlzfvmzTz1jjzbrt6sp9F517UxyPWK1SfTHgmcjZkhRyW40HC+QE+NqUPXw2OggcA HEzyUlKOoZFYTFyyu7/HKTUneM/e8QktTYIfYnJyP41d6OYrUJ11wuaiPAjgInVkkW6iy0jvXB rpu6u6WzBF/HbXsYHQKrVCpvClVjv+O/Doci2VHRsfKvJEGM6Or7NeZ5/kDZQsBMFgBAAA= X-Change-ID: 20251016-jk-refactor-queue-stats-9e721b34ce01 To: Aleksandr Loktionov , Alexander Lobakin , Tony Nguyen , Przemek Kitszel Cc: Simon Horman , intel-wired-lan@lists.osuosl.org, netdev@vger.kernel.org, Jacob Keller , Aleksandr Loktionov X-Mailer: b4 0.15-dev-f4b34 X-Developer-Signature: v=1; a=openpgp-sha256; l=5320; i=jacob.e.keller@intel.com; h=from:subject:message-id; bh=lg95PqZ6UPEcke6CzDZMCmLXRUpVCM7TugkfOV8gbAQ=; b=owGbwMvMwCWWNS3WLp9f4wXjabUkhky+xhrrNcF3ZjjfuNyql7+zPdumb3f2x7W1u3NS/0ctS U4xPVTbUcrCIMbFICumyKLgELLyuvGEMK03znIwc1iZQIYwcHEKwEQ+BjMyLA5e51xkec0iJPa7 c+5mpu49W9p/73AQiI2NdUianFSgzvBP51Flb+F++40ufw6tmHd9Y8xPFantvjdt9ujYdYX84N7 HCQA= X-Developer-Key: i=jacob.e.keller@intel.com; a=openpgp; fpr=204054A9D73390562AEC431E6A965D3E6F0F28E8 The ice driver has multiple u64 values stored in the ring structures for each queue used for statistics. These are accumulated in ice_update_vsi_stats(). The packet and byte values are read using the u64_stats API from . Several non-standard counters are also accumulated in the same function, but do not use the u64_stats API. This could result in load/store tears on 32-bit architectures. Further, since commit 316580b69d0a ("u64_stats: provide u64_stats_t type"), the u64 stats API has had u64_stats_t and access functions which convert to local64_t on 64-bit architectures. The ice driver doesn't use u64_stats_t and these access functions. Thus even on 64-bit architectures it could read inconsistent values. This series refactors the ice driver to use the updated API. Along the way I noticed several other issues and inconsistencies which I have cleaned up, summarized below. *) The driver never called u64_stats_init, leaving the syncp improperly initialized. Since the field is part of a kzalloc block, this only impacts 32-bit systems with CONFIG_LOCKDEP enabled. *) A few locations accessed the packets and byte counts directly without using the u64 stats API. *) The prev_pkt integer field is moved out of the stats structure and into the ice_tx_ring structure directly. *) Cache line comments in ice_tx_ring and ice_rx_ring were out of date and did not match the actual intended layout for systems with 64-bit cache lines. Convert the structures to use __cacheline_group instead of comments. *) The ice_fetch_u64_stats_per_ring() function took the ice_q_stats by value, defeating the point of using the u64_stats API entirely. To keep the stats increments short, I introduced ice_stats_inc, as otherwise each stat increment has to be quite verbose. Similarly a few places read only one stat, so I added ice_stats_read for those. This version uses struct ice_vsi_(tx|rx)_stats structures defined in ice_main.c for the accumulator. I haven't come up with a better solution that allows accumulating nicely without this structure. Its a bit frustrating as it copies the entries in the ring stats structures but with u64 instead of u64_stats_t. I am also still not entirely certain how the ice_update_vsi_ring_stats() function is synchronized in the ice driver. It is called from multiple places without an obvious synchronization mechanism. It is ultimately called from the service task and from ethtool, and I think it may also be called from one of the netdev stats callbacks. I'm open to suggestions on ways to improve this, as I think the result still has some ugly logic and a fair amount of near duplicate code. I have included the cacheline cleanup in ice_tx_ring and ice_rx_ring here, but that could arguably be split to its own series. I only noticed it because of attempting to move the prev_pkt field out of the ring stats. I replaced the comments with cacheline_group, but I did not make an attempt to optimize the existing cachelines. Probably we should experiment with the method used in idpf with the 'read-mostly', 'read-write' and 'cold' groupings, but doing so will require a more thorough deep dive on performance profiling and tuning. Signed-off-by: Jacob Keller --- Changes in v3: - Use SMP_CACHE_BYTES in assertions to avoid issues on ARM v7 with 128-byte cache (due to xdp_rxq_info changing size) - Only check the tx_lock cache group size for non-debug kernels, rather than keeping logic to check its size when DEBUG_LOCK_ALLOC is enabled. - Link to v2: https://patch.msgid.link/20251105-jk-refactor-queue-stats-v2-0-8652557f9572@intel.com Changes in v2: - Fix minor typos. - Link to v1: https://patch.msgid.link/20251103-jk-refactor-queue-stats-v1-0-164d2ed859b6@intel.com --- Jacob Keller (9): ice: initialize ring_stats->syncp ice: use cacheline groups for ice_rx_ring structure ice: use cacheline groups for ice_tx_ring structure ice: move prev_pkt from ice_txq_stats to ice_tx_ring ice: pass pointer to ice_fetch_u64_stats_per_ring ice: remove ice_q_stats struct and use struct_group ice: use u64_stats API to access pkts/bytes in dim sample ice: shorten ring stat names and add accessors ice: convert all ring stats to u64_stats_t drivers/net/ethernet/intel/ice/ice.h | 3 - drivers/net/ethernet/intel/ice/ice_lib.h | 6 + drivers/net/ethernet/intel/ice/ice_txrx.h | 135 ++++++++++++----- drivers/net/ethernet/intel/ice/ice_txrx_lib.h | 2 +- drivers/net/ethernet/intel/ice/ice_base.c | 4 +- drivers/net/ethernet/intel/ice/ice_ethtool.c | 30 ++-- drivers/net/ethernet/intel/ice/ice_lib.c | 61 ++++++-- drivers/net/ethernet/intel/ice/ice_main.c | 201 +++++++++++++++++--------- drivers/net/ethernet/intel/ice/ice_txrx.c | 45 +++--- drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 2 +- drivers/net/ethernet/intel/ice/ice_xsk.c | 4 +- 11 files changed, 331 insertions(+), 162 deletions(-) --- base-commit: c5acd2847bb5371988d6b5c165b0776967cc7da2 change-id: 20251016-jk-refactor-queue-stats-9e721b34ce01 Best regards, -- Jacob Keller