intel-wired-lan.osuosl.org archive mirror
 help / color / mirror / Atom feed
From: Jacob Keller <jacob.e.keller@intel.com>
To: Aleksandr Loktionov <aleksandr.loktionov@intel.com>,
	 Alexander Lobakin <aleksander.lobakin@intel.com>,
	 Tony Nguyen <anthony.l.nguyen@intel.com>,
	 Przemek Kitszel <przemyslaw.kitszel@intel.com>
Cc: intel-wired-lan@lists.osuosl.org, netdev@vger.kernel.org,
	 Jacob Keller <jacob.e.keller@intel.com>
Subject: [Intel-wired-lan] [PATCH iwl-next 3/9] ice: use cacheline groups for ice_tx_ring structure
Date: Mon, 03 Nov 2025 17:06:48 -0800	[thread overview]
Message-ID: <20251103-jk-refactor-queue-stats-v1-3-164d2ed859b6@intel.com> (raw)
In-Reply-To: <20251103-jk-refactor-queue-stats-v1-0-164d2ed859b6@intel.com>

The ice ring structure was reorganized by commit 65124bbf980c ("ice:
Reorganize tx_buf and ring structs"), and later split into a separate
ice_tx_ring structure by commit e72bba21355d ("ice: split ice_ring onto
Tx/Rx separate structs").o

The ice_tx_ring structure has comments left over from this reorganization
and split indicating which fields are supposed to belong to which
cachelines. Unfortunately, these comments are almost completely incorrect.

 * Cacheline 1 spans from the start of the structure to the vsi pointer.
   This cacheline is correct, and appears to be the only one that is.

 * Cacheline 2 spans from the DMA address down to the xps_state field. The
   comments indicate it should end at the rcu head field.

 * Cacheline 3 spans from the ice_channel pointer to the end of the struct,
   and completely contains what is marked as a separate 4th cacheline.

The use of such comments to indicate cache lines is error prone. It is
extremely likely that the comments will become out of date with future
refactors. Instead, use __cacheline_group_(begin|end)_aligned() which is
more explicit. It guarantees that members between the cacheline groups will
be in distinct cache lines through the use of padding. It additionally
enables compile time assertions to help prevent new fields from drastically
re-arranging the cache lines.

There are two main issues if we just replace the existing comments with
cache line group markers. First, the spinlock_t tx_lock field is 24 bytes
on most kernels, but is 64 bytes on CONFIG_DEBUG_LOCK_ALLOC kernels.
Ideally we want to place this field at the start of a cacheline so that
other fields in the group don't get split across such a debug kernel. While
optimizing such a debug kernel is not a priority, doing this makes the
assertions around the cacheline a bit easier to understand.

Remove the out-of-date cacheline comments, and add __cacheline_group
annotations. These are set to match the existing layout instead of matching
the original comments. The only change to layout is to re-order the tx_lock
to be the start of cacheline 3, and move txq_teid to avoid a 4-byte gap in
the layout.

Ideally, we should profile the Tx hot path and figure out which fields go
together and re-arrange the cacheline groups, possibly along "read_mostly",
"readwrite" and "cold" groupings similar to the idpf driver. This has been
left as an exercise for a later improvement.

Finally, add annotations which check the cacheline sizes. For cacheline 3,
we enforce that tx_lock is in this cacheline group, and check the size
based on whether or not the CONFIG_DEBUG_LOCK_ALLOC is enabled. Similar to
the Rx annotations, these check that the size of each cacheline group
(excluding padding) is no greater than 64 bytes. This is primarily intended
to produce compiler failures if developers add or re-arrange fields such
that cacheline groups exceed the expected 64 byte sizes on x86_64 systems.
Because the assertions check the size excluding any padding, they should
behave the same even on systems with larger L1 cacheline sizes.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_txrx.h | 32 +++++++++++++++++++++++++------
 drivers/net/ethernet/intel/ice/ice_main.c |  1 +
 2 files changed, 27 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.h b/drivers/net/ethernet/intel/ice/ice_txrx.h
index 6c708caf3a91..5350eb832ee5 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.h
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.h
@@ -319,7 +319,7 @@ static inline void ice_rx_ring_struct_check(void)
 }
 
 struct ice_tx_ring {
-	/* CL1 - 1st cacheline starts here */
+	__cacheline_group_begin_aligned(cl1);
 	struct ice_tx_ring *next;	/* pointer to next ring in q_vector */
 	void *desc;			/* Descriptor ring memory */
 	struct device *dev;		/* Used for DMA mapping */
@@ -328,7 +328,9 @@ struct ice_tx_ring {
 	struct ice_q_vector *q_vector;	/* Backreference to associated vector */
 	struct net_device *netdev;	/* netdev ring maps to */
 	struct ice_vsi *vsi;		/* Backreference to associated VSI */
-	/* CL2 - 2nd cacheline starts here */
+	__cacheline_group_end_aligned(cl1);
+
+	__cacheline_group_begin_aligned(cl2);
 	dma_addr_t dma;			/* physical address of ring */
 	struct xsk_buff_pool *xsk_pool;
 	u16 next_to_use;
@@ -340,15 +342,18 @@ struct ice_tx_ring {
 	u16 xdp_tx_active;
 	/* stats structs */
 	struct ice_ring_stats *ring_stats;
-	/* CL3 - 3rd cacheline starts here */
 	struct rcu_head rcu;		/* to avoid race on free */
 	DECLARE_BITMAP(xps_state, ICE_TX_NBITS);	/* XPS Config State */
+	__cacheline_group_end_aligned(cl2);
+
+	__cacheline_group_begin_aligned(cl3);
+	spinlock_t tx_lock;
 	struct ice_channel *ch;
 	struct ice_ptp_tx *tx_tstamps;
-	spinlock_t tx_lock;
-	u32 txq_teid;			/* Added Tx queue TEID */
-	/* CL4 - 4th cacheline starts here */
 	struct ice_tstamp_ring *tstamp_ring;
+
+	u32 txq_teid;			/* Added Tx queue TEID */
+
 #define ICE_TX_FLAGS_RING_XDP		BIT(0)
 #define ICE_TX_FLAGS_RING_VLAN_L2TAG1	BIT(1)
 #define ICE_TX_FLAGS_RING_VLAN_L2TAG2	BIT(2)
@@ -356,8 +361,23 @@ struct ice_tx_ring {
 	u8 flags;
 	u8 dcb_tc;			/* Traffic class of ring */
 	u16 quanta_prof_id;
+	__cacheline_group_end_aligned(cl3);
 } ____cacheline_internodealigned_in_smp;
 
+static inline void ice_tx_ring_struct_check(void)
+{
+	CACHELINE_ASSERT_GROUP_SIZE(struct ice_tx_ring, cl1, 64);
+	CACHELINE_ASSERT_GROUP_SIZE(struct ice_tx_ring, cl2, 64);
+
+	CACHELINE_ASSERT_GROUP_MEMBER(struct ice_tx_ring, cl3, tx_lock);
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+	/* spinlock_t is larger on DEBUG_LOCK_ALLOC kernels */
+	CACHELINE_ASSERT_GROUP_SIZE(struct ice_tx_ring, cl3, 128);
+#else
+	CACHELINE_ASSERT_GROUP_SIZE(struct ice_tx_ring, cl3, 64);
+#endif
+}
+
 static inline bool ice_ring_ch_enabled(struct ice_tx_ring *ring)
 {
 	return !!ring->ch;
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 4731dbaca9de..645a2113e8aa 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -5917,6 +5917,7 @@ static int __init ice_module_init(void)
 	int status = -ENOMEM;
 
 	ice_rx_ring_struct_check();
+	ice_tx_ring_struct_check();
 
 	pr_info("%s\n", ice_driver_string);
 	pr_info("%s\n", ice_copyright);

-- 
2.51.0.rc1.197.g6d975e95c9d7


  parent reply	other threads:[~2025-11-04  1:07 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-04  1:06 [Intel-wired-lan] [PATCH iwl-next 0/9] ice: properly use u64_stats API for all ring stats Jacob Keller
2025-11-04  1:06 ` [Intel-wired-lan] [PATCH iwl-next 1/9] ice: initialize ring_stats->syncp Jacob Keller
2025-11-04  7:22   ` Loktionov, Aleksandr
2025-11-04  1:06 ` [Intel-wired-lan] [PATCH iwl-next 2/9] ice: use cacheline groups for ice_rx_ring structure Jacob Keller
2025-11-04  7:23   ` Loktionov, Aleksandr
2025-11-04  1:06 ` Jacob Keller [this message]
2025-11-04  7:30   ` [Intel-wired-lan] [PATCH iwl-next 3/9] ice: use cacheline groups for ice_tx_ring structure Loktionov, Aleksandr
2025-11-04 18:40     ` Jacob Keller
2025-11-04  1:06 ` [Intel-wired-lan] [PATCH iwl-next 4/9] ice: move prev_pkt from ice_txq_stats to ice_tx_ring Jacob Keller
2025-11-04  1:06 ` [Intel-wired-lan] [PATCH iwl-next 5/9] ice: pass pointer to ice_fetch_u64_stats_per_ring Jacob Keller
2025-11-04  7:23   ` Loktionov, Aleksandr
2025-11-04  1:06 ` [Intel-wired-lan] [PATCH iwl-next 6/9] ice: remove ice_q_stats struct and use struct_group Jacob Keller
2025-11-04  7:24   ` Loktionov, Aleksandr
2025-11-04  1:06 ` [Intel-wired-lan] [PATCH iwl-next 7/9] ice: use u64_stats API to access pkts/bytes in dim sample Jacob Keller
2025-11-04  7:24   ` Loktionov, Aleksandr
2025-11-04  1:06 ` [Intel-wired-lan] [PATCH iwl-next 8/9] ice: shorten ring stat names and add accessors Jacob Keller
2025-11-04  7:29   ` Loktionov, Aleksandr
2025-11-04  1:06 ` [Intel-wired-lan] [PATCH iwl-next 9/9] ice: convert all ring stats to u64_stats_t Jacob Keller
2025-11-04  7:25   ` Loktionov, Aleksandr
2025-11-04  7:27 ` [Intel-wired-lan] [PATCH iwl-next 0/9] ice: properly use u64_stats API for all ring stats Loktionov, Aleksandr
2025-11-04 18:45   ` Jacob Keller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20251103-jk-refactor-queue-stats-v1-3-164d2ed859b6@intel.com \
    --to=jacob.e.keller@intel.com \
    --cc=aleksander.lobakin@intel.com \
    --cc=aleksandr.loktionov@intel.com \
    --cc=anthony.l.nguyen@intel.com \
    --cc=intel-wired-lan@lists.osuosl.org \
    --cc=netdev@vger.kernel.org \
    --cc=przemyslaw.kitszel@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).