netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH iwl-next v2 0/9] igc: Add support for Frame Preemption feature in IGC
@ 2025-02-05 10:05 Faizal Rahim
  2025-02-05 10:05 ` [PATCH iwl-next v2 1/9] net: ethtool: mm: extract stmmac verification logic into common library Faizal Rahim
                   ` (8 more replies)
  0 siblings, 9 replies; 14+ messages in thread
From: Faizal Rahim @ 2025-02-05 10:05 UTC (permalink / raw)
  To: Tony Nguyen, Przemek Kitszel, Andrew Lunn, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Maxime Coquelin,
	Alexandre Torgue, Simon Horman, Russell King, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Russell King, Furong Xu, Vladimir Oltean, Serge Semin,
	Xiaolei Wang, Suraj Jaiswal, Kory Maincent, Petr Tesarik,
	Faizal Rahim, Choong Yong Liang, Kunihiko Hayashi,
	Vinicius Costa Gomes, intel-wired-lan, netdev, linux-kernel,
	linux-stm32, linux-arm-kernel, bpf

Introduces support for the FPE feature in the IGC driver.

The patches aligns with the upstream FPE API:
https://patchwork.kernel.org/project/netdevbpf/cover/20230220122343.1156614-1-vladimir.oltean@nxp.com/
https://patchwork.kernel.org/project/netdevbpf/cover/20230119122705.73054-1-vladimir.oltean@nxp.com/

It builds upon earlier work:
https://patchwork.kernel.org/project/netdevbpf/cover/20220520011538.1098888-1-vinicius.gomes@intel.com/

The patch series adds the following functionalities to the IGC driver:
a) Configure FPE using `ethtool --set-mm`.
b) Display FPE settings via `ethtool --show-mm`.
c) View FPE statistics using `ethtool --include-statistics --show-mm'.
e) Enable preemptible/express queue with `fp`:
   tc qdisc add ... root taprio \
   fp E E P P

Change Log:
v1 -> v2:
- Extract the stmmac verification logic into a common library (Vladimir)
- igc to use common library for verification (Vladimir)
- Fix syntax for kernel-doc to use "Return:" (Vladimir)
- Use FIELD_GET instead of manual bit masking (Vladimir)
- Don't assign 0 to statistics counter in igc_ethtool_get_mm_stats() (Vladimir)
- Use pmac-enabled as a condition to allow MAC address value 0 (Vladimir)
- Define macro register value in increasing value order (Vladimir)
- Fix tx-min-frag-size handling for igc (Vladimir)
- Handle link state changes with verification in igc (Vladimir)
- Add static key for fast path code (Vladimir)
- rx_min_frag_size get from constant (Vladimir)

v1: https://patchwork.kernel.org/project/netdevbpf/cover/20241216064720.931522-1-faizal.abdul.rahim@linux.intel.com/

Faizal Rahim (8):
  igc: Rename xdp_get_tx_ring() for non-xdp usage
  igc: Optimize the TX packet buffer utilization
  igc: Set the RX packet buffer size for TSN mode
  igc: Add support for frame preemption verification
  igc: Add support to set tx-min-frag-size
  igc: Add support for preemptible traffic class in taprio
  igc: Add support to get MAC Merge data via ethtool
  igc: Add support to get frame preemption statistics via ethtool

Vladimir Oltean (1):
  net: ethtool: mm: extract stmmac verification logic into common
    library

 drivers/net/ethernet/intel/igc/igc.h          |  17 +-
 drivers/net/ethernet/intel/igc/igc_defines.h  |  19 +-
 drivers/net/ethernet/intel/igc/igc_ethtool.c  |  74 ++++++
 drivers/net/ethernet/intel/igc/igc_main.c     |  96 +++++++-
 drivers/net/ethernet/intel/igc/igc_regs.h     |  16 ++
 drivers/net/ethernet/intel/igc/igc_tsn.c      | 210 +++++++++++++++-
 drivers/net/ethernet/intel/igc/igc_tsn.h      |  34 +++
 drivers/net/ethernet/stmicro/stmmac/stmmac.h  |  16 +-
 .../ethernet/stmicro/stmmac/stmmac_ethtool.c  |  41 +---
 .../net/ethernet/stmicro/stmmac/stmmac_fpe.c  | 174 +++-----------
 .../net/ethernet/stmicro/stmmac/stmmac_fpe.h  |   5 -
 .../net/ethernet/stmicro/stmmac/stmmac_main.c |   8 +-
 include/linux/ethtool.h                       |  62 +++++
 net/ethtool/mm.c                              | 225 +++++++++++++++++-
 14 files changed, 780 insertions(+), 217 deletions(-)

--
2.34.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH iwl-next v2 1/9] net: ethtool: mm: extract stmmac verification logic into common library
  2025-02-05 10:05 [PATCH iwl-next v2 0/9] igc: Add support for Frame Preemption feature in IGC Faizal Rahim
@ 2025-02-05 10:05 ` Faizal Rahim
  2025-02-05 10:05 ` [PATCH iwl-next v2 2/9] igc: Rename xdp_get_tx_ring() for non-xdp usage Faizal Rahim
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Faizal Rahim @ 2025-02-05 10:05 UTC (permalink / raw)
  To: Tony Nguyen, Przemek Kitszel, Andrew Lunn, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Maxime Coquelin,
	Alexandre Torgue, Simon Horman, Russell King, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Russell King, Furong Xu, Vladimir Oltean, Serge Semin,
	Xiaolei Wang, Suraj Jaiswal, Kory Maincent, Petr Tesarik,
	Faizal Rahim, Choong Yong Liang, Kunihiko Hayashi,
	Vinicius Costa Gomes, intel-wired-lan, netdev, linux-kernel,
	linux-stm32, linux-arm-kernel, bpf

From: Vladimir Oltean <vladimir.oltean@nxp.com>

It appears that stmmac is not the only hardware which requires a
software-driven verification state machine for the MAC Merge layer.

While on the one hand it's good to encourage hardware implementations,
on the other hand it's quite difficult to tolerate multiple drivers
implementing independently fairly non-trivial logic.

Extract the hardware-independent logic from stmmac into library code and
put it in ethtool. Name the state structure "mmsv" for MAC Merge
Software Verification. Let this expose an operations structure for
executing the hardware stuff: sync hardware with the tx_active boolean
(result of verification process), enable/disable the pMAC, send mPackets,
notify library of external events (reception of mPackets), as well as
link state changes.

Note that it is assumed that the external events are received in hardirq
context. If they are not, it is probably a good idea to disable hardirqs
when calling ethtool_mmsv_event_handle(), because the library does not
do so.

Also, the MM software verification process has no business with the
tx_min_frag_size, that is all the driver's to handle.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Co-developed-by: Choong Yong Liang <yong.liang.choong@linux.intel.com>
Signed-off-by: Choong Yong Liang <yong.liang.choong@linux.intel.com>
Co-developed-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com>
Signed-off-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com>
Tested-by: Choong Yong Liang <yong.liang.choong@linux.intel.com>
---
 drivers/net/ethernet/stmicro/stmmac/stmmac.h  |  16 +-
 .../ethernet/stmicro/stmmac/stmmac_ethtool.c  |  41 +---
 .../net/ethernet/stmicro/stmmac/stmmac_fpe.c  | 174 +++-----------
 .../net/ethernet/stmicro/stmmac/stmmac_fpe.h  |   5 -
 .../net/ethernet/stmicro/stmmac/stmmac_main.c |   8 +-
 include/linux/ethtool.h                       |  62 +++++
 net/ethtool/mm.c                              | 223 ++++++++++++++++++
 7 files changed, 329 insertions(+), 200 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
index f05cae103d83..c9cc41af258a 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
@@ -147,21 +147,9 @@ struct stmmac_channel {
 };
 
 struct stmmac_fpe_cfg {
-	/* Serialize access to MAC Merge state between ethtool requests
-	 * and link state updates.
-	 */
-	spinlock_t lock;
-
+	struct ethtool_mmsv mmsv;
 	const struct stmmac_fpe_reg *reg;
-	u32 fpe_csr;				/* MAC_FPE_CTRL_STS reg cache */
-
-	enum ethtool_mm_verify_status status;
-	struct timer_list verify_timer;
-	bool verify_enabled;
-	int verify_retries;
-	bool pmac_enabled;
-	u32 verify_time;
-	bool tx_enabled;
+	u32 fpe_csr;	/* MAC_FPE_CTRL_STS reg cache */
 };
 
 struct stmmac_tc_entry {
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
index 918a32f8fda8..8e6b052e00d6 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
@@ -1210,37 +1210,17 @@ static int stmmac_get_mm(struct net_device *ndev,
 			 struct ethtool_mm_state *state)
 {
 	struct stmmac_priv *priv = netdev_priv(ndev);
-	unsigned long flags;
 	u32 frag_size;
 
 	if (!stmmac_fpe_supported(priv))
 		return -EOPNOTSUPP;
 
-	spin_lock_irqsave(&priv->fpe_cfg.lock, flags);
+	ethtool_mmsv_get_mm(&priv->fpe_cfg.mmsv, state);
 
-	state->max_verify_time = STMMAC_FPE_MM_MAX_VERIFY_TIME_MS;
-	state->verify_enabled = priv->fpe_cfg.verify_enabled;
-	state->pmac_enabled = priv->fpe_cfg.pmac_enabled;
-	state->verify_time = priv->fpe_cfg.verify_time;
-	state->tx_enabled = priv->fpe_cfg.tx_enabled;
-	state->verify_status = priv->fpe_cfg.status;
 	state->rx_min_frag_size = ETH_ZLEN;
-
-	/* FPE active if common tx_enabled and
-	 * (verification success or disabled(forced))
-	 */
-	if (state->tx_enabled &&
-	    (state->verify_status == ETHTOOL_MM_VERIFY_STATUS_SUCCEEDED ||
-	     state->verify_status == ETHTOOL_MM_VERIFY_STATUS_DISABLED))
-		state->tx_active = true;
-	else
-		state->tx_active = false;
-
 	frag_size = stmmac_fpe_get_add_frag_size(priv);
 	state->tx_min_frag_size = ethtool_mm_frag_size_add_to_min(frag_size);
 
-	spin_unlock_irqrestore(&priv->fpe_cfg.lock, flags);
-
 	return 0;
 }
 
@@ -1248,8 +1228,6 @@ static int stmmac_set_mm(struct net_device *ndev, struct ethtool_mm_cfg *cfg,
 			 struct netlink_ext_ack *extack)
 {
 	struct stmmac_priv *priv = netdev_priv(ndev);
-	struct stmmac_fpe_cfg *fpe_cfg = &priv->fpe_cfg;
-	unsigned long flags;
 	u32 frag_size;
 	int err;
 
@@ -1258,23 +1236,8 @@ static int stmmac_set_mm(struct net_device *ndev, struct ethtool_mm_cfg *cfg,
 	if (err)
 		return err;
 
-	/* Wait for the verification that's currently in progress to finish */
-	timer_shutdown_sync(&fpe_cfg->verify_timer);
-
-	spin_lock_irqsave(&fpe_cfg->lock, flags);
-
-	fpe_cfg->verify_enabled = cfg->verify_enabled;
-	fpe_cfg->pmac_enabled = cfg->pmac_enabled;
-	fpe_cfg->verify_time = cfg->verify_time;
-	fpe_cfg->tx_enabled = cfg->tx_enabled;
-
-	if (!cfg->verify_enabled)
-		fpe_cfg->status = ETHTOOL_MM_VERIFY_STATUS_DISABLED;
-
+	ethtool_mmsv_set_mm(&priv->fpe_cfg.mmsv, cfg);
 	stmmac_fpe_set_add_frag_size(priv, frag_size);
-	stmmac_fpe_apply(priv);
-
-	spin_unlock_irqrestore(&fpe_cfg->lock, flags);
 
 	return 0;
 }
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_fpe.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_fpe.c
index 3a4bee029c7f..75b470ee621a 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_fpe.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_fpe.c
@@ -27,12 +27,6 @@
 #define STMMAC_MAC_FPE_CTRL_STS_SVER	BIT(1)
 #define STMMAC_MAC_FPE_CTRL_STS_EFPE	BIT(0)
 
-/* FPE link-partner hand-shaking mPacket type */
-enum stmmac_mpacket_type {
-	MPACKET_VERIFY = 0,
-	MPACKET_RESPONSE = 1,
-};
-
 struct stmmac_fpe_reg {
 	const u32 mac_fpe_reg;		/* offset of MAC_FPE_CTRL_STS */
 	const u32 mtl_fpe_reg;		/* offset of MTL_FPE_CTRL_STS */
@@ -48,10 +42,10 @@ bool stmmac_fpe_supported(struct stmmac_priv *priv)
 		priv->hw->mac->fpe_map_preemption_class;
 }
 
-static void stmmac_fpe_configure(struct stmmac_priv *priv, bool tx_enable,
-				 bool pmac_enable)
+static void stmmac_fpe_configure_tx(struct ethtool_mmsv *mmsv, bool tx_enable)
 {
-	struct stmmac_fpe_cfg *cfg = &priv->fpe_cfg;
+	struct stmmac_fpe_cfg *cfg = container_of(mmsv, struct stmmac_fpe_cfg, mmsv);
+	struct stmmac_priv *priv = container_of(cfg, struct stmmac_priv, fpe_cfg);
 	const struct stmmac_fpe_reg *reg = cfg->reg;
 	u32 num_rxq = priv->plat->rx_queues_to_use;
 	void __iomem *ioaddr = priv->ioaddr;
@@ -68,6 +62,15 @@ static void stmmac_fpe_configure(struct stmmac_priv *priv, bool tx_enable,
 		cfg->fpe_csr = 0;
 	}
 	writel(cfg->fpe_csr, ioaddr + reg->mac_fpe_reg);
+}
+
+static void stmmac_fpe_configure_pmac(struct ethtool_mmsv *mmsv, bool pmac_enable)
+{
+	struct stmmac_fpe_cfg *cfg = container_of(mmsv, struct stmmac_fpe_cfg, mmsv);
+	struct stmmac_priv *priv = container_of(cfg, struct stmmac_priv, fpe_cfg);
+	const struct stmmac_fpe_reg *reg = cfg->reg;
+	void __iomem *ioaddr = priv->ioaddr;
+	u32 value;
 
 	value = readl(ioaddr + reg->int_en_reg);
 
@@ -85,47 +88,45 @@ static void stmmac_fpe_configure(struct stmmac_priv *priv, bool tx_enable,
 	writel(value, ioaddr + reg->int_en_reg);
 }
 
-static void stmmac_fpe_send_mpacket(struct stmmac_priv *priv,
-				    enum stmmac_mpacket_type type)
+static void stmmac_fpe_send_mpacket(struct ethtool_mmsv *mmsv,
+				    enum ethtool_mpacket type)
 {
-	const struct stmmac_fpe_reg *reg = priv->fpe_cfg.reg;
+	struct stmmac_fpe_cfg *cfg = container_of(mmsv, struct stmmac_fpe_cfg, mmsv);
+	struct stmmac_priv *priv = container_of(cfg, struct stmmac_priv, fpe_cfg);
+	const struct stmmac_fpe_reg *reg = cfg->reg;
 	void __iomem *ioaddr = priv->ioaddr;
-	u32 value = priv->fpe_cfg.fpe_csr;
+	u32 value = cfg->fpe_csr;
 
-	if (type == MPACKET_VERIFY)
+	if (type == ETHTOOL_MPACKET_VERIFY)
 		value |= STMMAC_MAC_FPE_CTRL_STS_SVER;
-	else if (type == MPACKET_RESPONSE)
+	else if (type == ETHTOOL_MPACKET_RESPONSE)
 		value |= STMMAC_MAC_FPE_CTRL_STS_SRSP;
 
 	writel(value, ioaddr + reg->mac_fpe_reg);
 }
 
+static const struct ethtool_mmsv_ops stmmac_mmsv_ops = {
+	.configure_tx = stmmac_fpe_configure_tx,
+	.configure_pmac = stmmac_fpe_configure_pmac,
+	.send_mpacket = stmmac_fpe_send_mpacket,
+};
+
 static void stmmac_fpe_event_status(struct stmmac_priv *priv, int status)
 {
 	struct stmmac_fpe_cfg *fpe_cfg = &priv->fpe_cfg;
+	struct ethtool_mmsv *mmsv = &fpe_cfg->mmsv;
 
-	/* This is interrupt context, just spin_lock() */
-	spin_lock(&fpe_cfg->lock);
-
-	if (!fpe_cfg->pmac_enabled || status == FPE_EVENT_UNKNOWN)
-		goto unlock_out;
+	if (status == FPE_EVENT_UNKNOWN)
+		return;
 
-	/* LP has sent verify mPacket */
 	if ((status & FPE_EVENT_RVER) == FPE_EVENT_RVER)
-		stmmac_fpe_send_mpacket(priv, MPACKET_RESPONSE);
+		ethtool_mmsv_event_handle(mmsv, ETHTOOL_MMSV_LP_SENT_VERIFY_MPACKET);
 
-	/* Local has sent verify mPacket */
-	if ((status & FPE_EVENT_TVER) == FPE_EVENT_TVER &&
-	    fpe_cfg->status != ETHTOOL_MM_VERIFY_STATUS_SUCCEEDED)
-		fpe_cfg->status = ETHTOOL_MM_VERIFY_STATUS_VERIFYING;
+	if ((status & FPE_EVENT_TVER) == FPE_EVENT_TVER)
+		ethtool_mmsv_event_handle(mmsv, ETHTOOL_MMSV_LD_SENT_VERIFY_MPACKET);
 
-	/* LP has sent response mPacket */
-	if ((status & FPE_EVENT_RRSP) == FPE_EVENT_RRSP &&
-	    fpe_cfg->status == ETHTOOL_MM_VERIFY_STATUS_VERIFYING)
-		fpe_cfg->status = ETHTOOL_MM_VERIFY_STATUS_SUCCEEDED;
-
-unlock_out:
-	spin_unlock(&fpe_cfg->lock);
+	if ((status & FPE_EVENT_RRSP) == FPE_EVENT_RRSP)
+		ethtool_mmsv_event_handle(mmsv, ETHTOOL_MMSV_LP_SENT_RESPONSE_MPACKET);
 }
 
 void stmmac_fpe_irq_status(struct stmmac_priv *priv)
@@ -164,119 +165,16 @@ void stmmac_fpe_irq_status(struct stmmac_priv *priv)
 	stmmac_fpe_event_status(priv, status);
 }
 
-/**
- * stmmac_fpe_verify_timer - Timer for MAC Merge verification
- * @t:  timer_list struct containing private info
- *
- * Verify the MAC Merge capability in the local TX direction, by
- * transmitting Verify mPackets up to 3 times. Wait until link
- * partner responds with a Response mPacket, otherwise fail.
- */
-static void stmmac_fpe_verify_timer(struct timer_list *t)
-{
-	struct stmmac_fpe_cfg *fpe_cfg = from_timer(fpe_cfg, t, verify_timer);
-	struct stmmac_priv *priv = container_of(fpe_cfg, struct stmmac_priv,
-						fpe_cfg);
-	unsigned long flags;
-	bool rearm = false;
-
-	spin_lock_irqsave(&fpe_cfg->lock, flags);
-
-	switch (fpe_cfg->status) {
-	case ETHTOOL_MM_VERIFY_STATUS_INITIAL:
-	case ETHTOOL_MM_VERIFY_STATUS_VERIFYING:
-		if (fpe_cfg->verify_retries != 0) {
-			stmmac_fpe_send_mpacket(priv, MPACKET_VERIFY);
-			rearm = true;
-		} else {
-			fpe_cfg->status = ETHTOOL_MM_VERIFY_STATUS_FAILED;
-		}
-
-		fpe_cfg->verify_retries--;
-		break;
-
-	case ETHTOOL_MM_VERIFY_STATUS_SUCCEEDED:
-		stmmac_fpe_configure(priv, true, true);
-		break;
-
-	default:
-		break;
-	}
-
-	if (rearm) {
-		mod_timer(&fpe_cfg->verify_timer,
-			  jiffies + msecs_to_jiffies(fpe_cfg->verify_time));
-	}
-
-	spin_unlock_irqrestore(&fpe_cfg->lock, flags);
-}
-
-static void stmmac_fpe_verify_timer_arm(struct stmmac_fpe_cfg *fpe_cfg)
-{
-	if (fpe_cfg->pmac_enabled && fpe_cfg->tx_enabled &&
-	    fpe_cfg->verify_enabled &&
-	    fpe_cfg->status != ETHTOOL_MM_VERIFY_STATUS_FAILED &&
-	    fpe_cfg->status != ETHTOOL_MM_VERIFY_STATUS_SUCCEEDED) {
-		timer_setup(&fpe_cfg->verify_timer, stmmac_fpe_verify_timer, 0);
-		mod_timer(&fpe_cfg->verify_timer, jiffies);
-	}
-}
-
 void stmmac_fpe_init(struct stmmac_priv *priv)
 {
-	priv->fpe_cfg.verify_retries = STMMAC_FPE_MM_MAX_VERIFY_RETRIES;
-	priv->fpe_cfg.verify_time = STMMAC_FPE_MM_MAX_VERIFY_TIME_MS;
-	priv->fpe_cfg.status = ETHTOOL_MM_VERIFY_STATUS_DISABLED;
-	timer_setup(&priv->fpe_cfg.verify_timer, stmmac_fpe_verify_timer, 0);
-	spin_lock_init(&priv->fpe_cfg.lock);
+	ethtool_mmsv_init(&priv->fpe_cfg.mmsv, priv->dev,
+			  &stmmac_mmsv_ops);
 
 	if ((!priv->fpe_cfg.reg || !priv->hw->mac->fpe_map_preemption_class) &&
 	    priv->dma_cap.fpesel)
 		dev_info(priv->device, "FPE is not supported by driver.\n");
 }
 
-void stmmac_fpe_apply(struct stmmac_priv *priv)
-{
-	struct stmmac_fpe_cfg *fpe_cfg = &priv->fpe_cfg;
-
-	/* If verification is disabled, configure FPE right away.
-	 * Otherwise let the timer code do it.
-	 */
-	if (!fpe_cfg->verify_enabled) {
-		stmmac_fpe_configure(priv, fpe_cfg->tx_enabled,
-				     fpe_cfg->pmac_enabled);
-	} else {
-		fpe_cfg->status = ETHTOOL_MM_VERIFY_STATUS_INITIAL;
-		fpe_cfg->verify_retries = STMMAC_FPE_MM_MAX_VERIFY_RETRIES;
-
-		if (netif_running(priv->dev))
-			stmmac_fpe_verify_timer_arm(fpe_cfg);
-	}
-}
-
-void stmmac_fpe_link_state_handle(struct stmmac_priv *priv, bool is_up)
-{
-	struct stmmac_fpe_cfg *fpe_cfg = &priv->fpe_cfg;
-	unsigned long flags;
-
-	timer_shutdown_sync(&fpe_cfg->verify_timer);
-
-	spin_lock_irqsave(&fpe_cfg->lock, flags);
-
-	if (is_up && fpe_cfg->pmac_enabled) {
-		/* VERIFY process requires pmac enabled when NIC comes up */
-		stmmac_fpe_configure(priv, false, true);
-
-		/* New link => maybe new partner => new verification process */
-		stmmac_fpe_apply(priv);
-	} else {
-		/* No link => turn off EFPE */
-		stmmac_fpe_configure(priv, false, false);
-	}
-
-	spin_unlock_irqrestore(&fpe_cfg->lock, flags);
-}
-
 int stmmac_fpe_get_add_frag_size(struct stmmac_priv *priv)
 {
 	const struct stmmac_fpe_reg *reg = priv->fpe_cfg.reg;
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_fpe.h b/drivers/net/ethernet/stmicro/stmmac/stmmac_fpe.h
index b884eac7142d..3fc46acf7001 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_fpe.h
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_fpe.h
@@ -9,15 +9,10 @@
 #include <linux/types.h>
 #include <linux/netdevice.h>
 
-#define STMMAC_FPE_MM_MAX_VERIFY_RETRIES	3
-#define STMMAC_FPE_MM_MAX_VERIFY_TIME_MS	128
-
 struct stmmac_priv;
 
-void stmmac_fpe_link_state_handle(struct stmmac_priv *priv, bool is_up);
 bool stmmac_fpe_supported(struct stmmac_priv *priv);
 void stmmac_fpe_init(struct stmmac_priv *priv);
-void stmmac_fpe_apply(struct stmmac_priv *priv);
 void stmmac_fpe_irq_status(struct stmmac_priv *priv);
 int stmmac_fpe_get_add_frag_size(struct stmmac_priv *priv);
 void stmmac_fpe_set_add_frag_size(struct stmmac_priv *priv, u32 add_frag_size);
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index d04543e5697b..8f723f9e84ba 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -992,7 +992,7 @@ static void stmmac_mac_link_down(struct phylink_config *config,
 		stmmac_set_eee_pls(priv, priv->hw, false);
 
 	if (stmmac_fpe_supported(priv))
-		stmmac_fpe_link_state_handle(priv, false);
+		ethtool_mmsv_link_state_handle(&priv->fpe_cfg.mmsv, false);
 }
 
 static void stmmac_mac_link_up(struct phylink_config *config,
@@ -1100,7 +1100,7 @@ static void stmmac_mac_link_up(struct phylink_config *config,
 		stmmac_set_eee_pls(priv, priv->hw, true);
 
 	if (stmmac_fpe_supported(priv))
-		stmmac_fpe_link_state_handle(priv, true);
+		ethtool_mmsv_link_state_handle(&priv->fpe_cfg.mmsv, true);
 
 	if (priv->plat->flags & STMMAC_FLAG_HWTSTAMP_CORRECT_LATENCY)
 		stmmac_hwtstamp_correct_latency(priv, priv);
@@ -4097,7 +4097,7 @@ static int stmmac_release(struct net_device *dev)
 	stmmac_release_ptp(priv);
 
 	if (stmmac_fpe_supported(priv))
-		timer_shutdown_sync(&priv->fpe_cfg.verify_timer);
+		ethtool_mmsv_stop(&priv->fpe_cfg.mmsv);
 
 	pm_runtime_put(priv->device);
 
@@ -7822,7 +7822,7 @@ int stmmac_suspend(struct device *dev)
 	rtnl_unlock();
 
 	if (stmmac_fpe_supported(priv))
-		timer_shutdown_sync(&priv->fpe_cfg.verify_timer);
+		ethtool_mmsv_stop(&priv->fpe_cfg.mmsv);
 
 	priv->speed = SPEED_UNKNOWN;
 	return 0;
diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index 870994cc3ef7..c2511f78afd9 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -17,9 +17,13 @@
 #include <linux/compat.h>
 #include <linux/if_ether.h>
 #include <linux/netlink.h>
+#include <linux/timer_types.h>
 #include <uapi/linux/ethtool.h>
 #include <uapi/linux/net_tstamp.h>
 
+#define ETHTOOL_MM_MAX_VERIFY_TIME_MS		128
+#define ETHTOOL_MM_MAX_VERIFY_RETRIES		3
+
 struct compat_ethtool_rx_flow_spec {
 	u32		flow_type;
 	union ethtool_flow_union h_u;
@@ -710,6 +714,64 @@ struct ethtool_mm_stats {
 	u64 MACMergeHoldCount;
 };
 
+enum ethtool_mmsv_event {
+	ETHTOOL_MMSV_LP_SENT_VERIFY_MPACKET,
+	ETHTOOL_MMSV_LD_SENT_VERIFY_MPACKET,
+	ETHTOOL_MMSV_LP_SENT_RESPONSE_MPACKET,
+};
+
+/* MAC Merge verification mPacket type */
+enum ethtool_mpacket {
+	ETHTOOL_MPACKET_VERIFY,
+	ETHTOOL_MPACKET_RESPONSE,
+};
+
+struct ethtool_mmsv;
+
+struct ethtool_mmsv_ops {
+	void (*configure_tx)(struct ethtool_mmsv *mmsv, bool tx_active);
+	void (*configure_pmac)(struct ethtool_mmsv *mmsv, bool pmac_enabled);
+	void (*send_mpacket)(struct ethtool_mmsv *mmsv, enum ethtool_mpacket mpacket);
+};
+
+/**
+ * struct ethtool_mmsv - MAC Merge Software Verification
+ * @ops: operations for MAC Merge Software Verification
+ * @dev: pointer to net_device structure
+ * @lock: serialize access to MAC Merge state between
+ *	  ethtool requests and link state updates.
+ * @status: current verification FSM state
+ * @verify_timer: timer for verification in local TX direction
+ * @verify_enabled: indicates if verification is enabled
+ * @verify_retries: number of retries for verification
+ * @pmac_enabled: indicates if the preemptible MAC is enabled
+ * @verify_time: time for verification in milliseconds
+ * @tx_enabled: indicates if transmission is enabled
+ */
+struct ethtool_mmsv {
+	const struct ethtool_mmsv_ops *ops;
+	struct net_device *dev;
+	spinlock_t lock;
+	enum ethtool_mm_verify_status status;
+	struct timer_list verify_timer;
+	bool verify_enabled;
+	int verify_retries;
+	bool pmac_enabled;
+	u32 verify_time;
+	bool tx_enabled;
+};
+
+void ethtool_mmsv_stop(struct ethtool_mmsv *mmsv);
+void ethtool_mmsv_link_state_handle(struct ethtool_mmsv *mmsv, bool up);
+void ethtool_mmsv_event_handle(struct ethtool_mmsv *mmsv,
+			       enum ethtool_mmsv_event event);
+void ethtool_mmsv_get_mm(struct ethtool_mmsv *mmsv,
+			 struct ethtool_mm_state *state);
+void ethtool_mmsv_set_mm(struct ethtool_mmsv *mmsv, struct ethtool_mm_cfg *cfg);
+void ethtool_mmsv_init(struct ethtool_mmsv *mmsv, struct net_device *dev,
+		       const struct ethtool_mmsv_ops *ops);
+bool ethtool_mmsv_is_tx_active(struct ethtool_mmsv *mmsv);
+
 /**
  * struct ethtool_rxfh_param - RXFH (RSS) parameters
  * @hfunc: Defines the current RSS hash function used by HW (or to be set to).
diff --git a/net/ethtool/mm.c b/net/ethtool/mm.c
index 2816bb23c3ad..2ebb811948ec 100644
--- a/net/ethtool/mm.c
+++ b/net/ethtool/mm.c
@@ -282,3 +282,226 @@ bool ethtool_dev_mm_supported(struct net_device *dev)
 	return supported;
 }
 EXPORT_SYMBOL_GPL(ethtool_dev_mm_supported);
+
+static void ethtool_mmsv_configure_tx(struct ethtool_mmsv *mmsv,
+				      bool tx_active)
+{
+	if (mmsv->ops->configure_tx)
+		mmsv->ops->configure_tx(mmsv, tx_active);
+}
+
+static void ethtool_mmsv_configure_pmac(struct ethtool_mmsv *mmsv,
+					bool pmac_enabled)
+{
+	if (mmsv->ops->configure_pmac)
+		mmsv->ops->configure_pmac(mmsv, pmac_enabled);
+}
+
+static void ethtool_mmsv_send_mpacket(struct ethtool_mmsv *mmsv,
+				      enum ethtool_mpacket mpacket)
+{
+	if (mmsv->ops->send_mpacket)
+		mmsv->ops->send_mpacket(mmsv, mpacket);
+}
+
+/**
+ * ethtool_mmsv_verify_timer - Timer for MAC Merge verification
+ * @t: timer_list struct containing private info
+ *
+ * Verify the MAC Merge capability in the local TX direction, by
+ * transmitting Verify mPackets up to 3 times. Wait until link
+ * partner responds with a Response mPacket, otherwise fail.
+ */
+static void ethtool_mmsv_verify_timer(struct timer_list *t)
+{
+	struct ethtool_mmsv *mmsv = from_timer(mmsv, t, verify_timer);
+	unsigned long flags;
+	bool rearm = false;
+
+	spin_lock_irqsave(&mmsv->lock, flags);
+
+	switch (mmsv->status) {
+	case ETHTOOL_MM_VERIFY_STATUS_INITIAL:
+	case ETHTOOL_MM_VERIFY_STATUS_VERIFYING:
+		if (mmsv->verify_retries != 0) {
+			ethtool_mmsv_send_mpacket(mmsv, ETHTOOL_MPACKET_VERIFY);
+			rearm = true;
+		} else {
+			mmsv->status = ETHTOOL_MM_VERIFY_STATUS_FAILED;
+		}
+
+		mmsv->verify_retries--;
+		break;
+
+	case ETHTOOL_MM_VERIFY_STATUS_SUCCEEDED:
+		ethtool_mmsv_configure_tx(mmsv, true);
+		break;
+
+	default:
+		break;
+	}
+
+	if (rearm) {
+		mod_timer(&mmsv->verify_timer,
+			  jiffies + msecs_to_jiffies(mmsv->verify_time));
+	}
+
+	spin_unlock_irqrestore(&mmsv->lock, flags);
+}
+
+static void ethtool_mmsv_verify_timer_arm(struct ethtool_mmsv *mmsv)
+{
+	if (mmsv->pmac_enabled && mmsv->tx_enabled && mmsv->verify_enabled &&
+	    mmsv->status != ETHTOOL_MM_VERIFY_STATUS_FAILED &&
+	    mmsv->status != ETHTOOL_MM_VERIFY_STATUS_SUCCEEDED) {
+		timer_setup(&mmsv->verify_timer, ethtool_mmsv_verify_timer, 0);
+		mod_timer(&mmsv->verify_timer, jiffies);
+	}
+}
+
+static void ethtool_mmsv_apply(struct ethtool_mmsv *mmsv)
+{
+	/* If verification is disabled, configure FPE right away.
+	 * Otherwise let the timer code do it.
+	 */
+	if (!mmsv->verify_enabled) {
+		ethtool_mmsv_configure_pmac(mmsv, mmsv->pmac_enabled);
+		ethtool_mmsv_configure_tx(mmsv, mmsv->tx_enabled);
+	} else {
+		mmsv->status = ETHTOOL_MM_VERIFY_STATUS_INITIAL;
+		mmsv->verify_retries = ETHTOOL_MM_MAX_VERIFY_RETRIES;
+
+		if (netif_running(mmsv->dev))
+			ethtool_mmsv_verify_timer_arm(mmsv);
+	}
+}
+
+void ethtool_mmsv_stop(struct ethtool_mmsv *mmsv)
+{
+	timer_shutdown_sync(&mmsv->verify_timer);
+}
+EXPORT_SYMBOL_GPL(ethtool_mmsv_stop);
+
+void ethtool_mmsv_link_state_handle(struct ethtool_mmsv *mmsv, bool up)
+{
+	unsigned long flags;
+
+	ethtool_mmsv_stop(mmsv);
+
+	spin_lock_irqsave(&mmsv->lock, flags);
+
+	if (up && mmsv->pmac_enabled) {
+		/* VERIFY process requires pMAC enabled when NIC comes up */
+		ethtool_mmsv_configure_pmac(mmsv, true);
+
+		/* New link => maybe new partner => new verification process */
+		ethtool_mmsv_apply(mmsv);
+	} else {
+		mmsv->status = ETHTOOL_MM_VERIFY_STATUS_INITIAL;
+		mmsv->verify_retries = ETHTOOL_MM_MAX_VERIFY_RETRIES;
+
+		/* No link or pMAC not enabled */
+		ethtool_mmsv_configure_pmac(mmsv, false);
+		ethtool_mmsv_configure_tx(mmsv, false);
+	}
+
+	spin_unlock_irqrestore(&mmsv->lock, flags);
+}
+EXPORT_SYMBOL_GPL(ethtool_mmsv_link_state_handle);
+
+void ethtool_mmsv_event_handle(struct ethtool_mmsv *mmsv,
+			       enum ethtool_mmsv_event event)
+{
+	/* This is interrupt context, just spin_lock() */
+	spin_lock(&mmsv->lock);
+
+	if (!mmsv->pmac_enabled)
+		goto unlock;
+
+	switch (event) {
+	case ETHTOOL_MMSV_LP_SENT_VERIFY_MPACKET:
+		/* Link partner has sent verify mPacket */
+		ethtool_mmsv_send_mpacket(mmsv, ETHTOOL_MPACKET_RESPONSE);
+		break;
+	case ETHTOOL_MMSV_LD_SENT_VERIFY_MPACKET:
+		/* Local device has sent verify mPacket */
+		if (mmsv->status != ETHTOOL_MM_VERIFY_STATUS_SUCCEEDED)
+			mmsv->status = ETHTOOL_MM_VERIFY_STATUS_VERIFYING;
+		break;
+	case ETHTOOL_MMSV_LP_SENT_RESPONSE_MPACKET:
+		/* Link partner has sent response mPacket */
+		if (mmsv->status == ETHTOOL_MM_VERIFY_STATUS_VERIFYING)
+			mmsv->status = ETHTOOL_MM_VERIFY_STATUS_SUCCEEDED;
+		break;
+	}
+
+unlock:
+	spin_unlock(&mmsv->lock);
+}
+EXPORT_SYMBOL_GPL(ethtool_mmsv_event_handle);
+
+bool ethtool_mmsv_is_tx_active(struct ethtool_mmsv *mmsv)
+{
+	/* TX is active if administratively enabled, and verification either
+	 * succeeded, or was administratively disabled.
+	 */
+	return mmsv->tx_enabled &&
+	       (mmsv->status == ETHTOOL_MM_VERIFY_STATUS_SUCCEEDED ||
+	       mmsv->status == ETHTOOL_MM_VERIFY_STATUS_DISABLED);
+}
+EXPORT_SYMBOL_GPL(ethtool_mmsv_is_tx_active);
+
+void ethtool_mmsv_get_mm(struct ethtool_mmsv *mmsv,
+			 struct ethtool_mm_state *state)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&mmsv->lock, flags);
+
+	state->max_verify_time = ETHTOOL_MM_MAX_VERIFY_TIME_MS;
+	state->verify_enabled = mmsv->verify_enabled;
+	state->pmac_enabled = mmsv->pmac_enabled;
+	state->verify_time = mmsv->verify_time;
+	state->tx_enabled = mmsv->tx_enabled;
+	state->verify_status = mmsv->status;
+	state->tx_active = ethtool_mmsv_is_tx_active(mmsv);
+
+	spin_unlock_irqrestore(&mmsv->lock, flags);
+}
+EXPORT_SYMBOL_GPL(ethtool_mmsv_get_mm);
+
+void ethtool_mmsv_set_mm(struct ethtool_mmsv *mmsv, struct ethtool_mm_cfg *cfg)
+{
+	unsigned long flags;
+
+	/* Wait for the verification that's currently in progress to finish */
+	ethtool_mmsv_stop(mmsv);
+
+	spin_lock_irqsave(&mmsv->lock, flags);
+
+	mmsv->verify_enabled = cfg->verify_enabled;
+	mmsv->pmac_enabled = cfg->pmac_enabled;
+	mmsv->verify_time = cfg->verify_time;
+	mmsv->tx_enabled = cfg->tx_enabled;
+
+	if (!cfg->verify_enabled)
+		mmsv->status = ETHTOOL_MM_VERIFY_STATUS_DISABLED;
+
+	ethtool_mmsv_apply(mmsv);
+
+	spin_unlock_irqrestore(&mmsv->lock, flags);
+}
+EXPORT_SYMBOL_GPL(ethtool_mmsv_set_mm);
+
+void ethtool_mmsv_init(struct ethtool_mmsv *mmsv, struct net_device *dev,
+		       const struct ethtool_mmsv_ops *ops)
+{
+	mmsv->ops = ops;
+	mmsv->dev = dev;
+	mmsv->verify_retries = ETHTOOL_MM_MAX_VERIFY_RETRIES;
+	mmsv->verify_time = ETHTOOL_MM_MAX_VERIFY_TIME_MS;
+	mmsv->status = ETHTOOL_MM_VERIFY_STATUS_DISABLED;
+	timer_setup(&mmsv->verify_timer, ethtool_mmsv_verify_timer, 0);
+	spin_lock_init(&mmsv->lock);
+}
+EXPORT_SYMBOL_GPL(ethtool_mmsv_init);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH iwl-next v2 2/9] igc: Rename xdp_get_tx_ring() for non-xdp usage
  2025-02-05 10:05 [PATCH iwl-next v2 0/9] igc: Add support for Frame Preemption feature in IGC Faizal Rahim
  2025-02-05 10:05 ` [PATCH iwl-next v2 1/9] net: ethtool: mm: extract stmmac verification logic into common library Faizal Rahim
@ 2025-02-05 10:05 ` Faizal Rahim
  2025-02-05 10:05 ` [PATCH iwl-next v2 3/9] igc: Optimize the TX packet buffer utilization Faizal Rahim
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Faizal Rahim @ 2025-02-05 10:05 UTC (permalink / raw)
  To: Tony Nguyen, Przemek Kitszel, Andrew Lunn, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Maxime Coquelin,
	Alexandre Torgue, Simon Horman, Russell King, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Russell King, Furong Xu, Vladimir Oltean, Serge Semin,
	Xiaolei Wang, Suraj Jaiswal, Kory Maincent, Petr Tesarik,
	Faizal Rahim, Choong Yong Liang, Kunihiko Hayashi,
	Vinicius Costa Gomes, intel-wired-lan, netdev, linux-kernel,
	linux-stm32, linux-arm-kernel, bpf

Renamed xdp_get_tx_ring() function to a more generic name for use in
upcoming frame preemption patches.

Signed-off-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com>
---
 drivers/net/ethernet/intel/igc/igc.h      |  2 +-
 drivers/net/ethernet/intel/igc/igc_main.c | 10 +++++-----
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h
index b8111ad9a9a8..22ecdac26cf4 100644
--- a/drivers/net/ethernet/intel/igc/igc.h
+++ b/drivers/net/ethernet/intel/igc/igc.h
@@ -736,7 +736,7 @@ struct igc_nfc_rule *igc_get_nfc_rule(struct igc_adapter *adapter,
 				      u32 location);
 int igc_add_nfc_rule(struct igc_adapter *adapter, struct igc_nfc_rule *rule);
 void igc_del_nfc_rule(struct igc_adapter *adapter, struct igc_nfc_rule *rule);
-
+struct igc_ring *igc_get_tx_ring(struct igc_adapter *adapter, int cpu);
 void igc_ptp_init(struct igc_adapter *adapter);
 void igc_ptp_reset(struct igc_adapter *adapter);
 void igc_ptp_suspend(struct igc_adapter *adapter);
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 56a35d58e7a6..44e4f925491f 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -2444,8 +2444,8 @@ static int igc_xdp_init_tx_descriptor(struct igc_ring *ring,
 	return -ENOMEM;
 }
 
-static struct igc_ring *igc_xdp_get_tx_ring(struct igc_adapter *adapter,
-					    int cpu)
+struct igc_ring *igc_get_tx_ring(struct igc_adapter *adapter,
+				 int cpu)
 {
 	int index = cpu;
 
@@ -2469,7 +2469,7 @@ static int igc_xdp_xmit_back(struct igc_adapter *adapter, struct xdp_buff *xdp)
 	if (unlikely(!xdpf))
 		return -EFAULT;
 
-	ring = igc_xdp_get_tx_ring(adapter, cpu);
+	ring = igc_get_tx_ring(adapter, cpu);
 	nq = txring_txq(ring);
 
 	__netif_tx_lock(nq, cpu);
@@ -2546,7 +2546,7 @@ static void igc_finalize_xdp(struct igc_adapter *adapter, int status)
 	struct igc_ring *ring;
 
 	if (status & IGC_XDP_TX) {
-		ring = igc_xdp_get_tx_ring(adapter, cpu);
+		ring = igc_get_tx_ring(adapter, cpu);
 		nq = txring_txq(ring);
 
 		__netif_tx_lock(nq, cpu);
@@ -6699,7 +6699,7 @@ static int igc_xdp_xmit(struct net_device *dev, int num_frames,
 	if (unlikely(flags & ~XDP_XMIT_FLAGS_MASK))
 		return -EINVAL;
 
-	ring = igc_xdp_get_tx_ring(adapter, cpu);
+	ring = igc_get_tx_ring(adapter, cpu);
 	nq = txring_txq(ring);
 
 	__netif_tx_lock(nq, cpu);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH iwl-next v2 3/9] igc: Optimize the TX packet buffer utilization
  2025-02-05 10:05 [PATCH iwl-next v2 0/9] igc: Add support for Frame Preemption feature in IGC Faizal Rahim
  2025-02-05 10:05 ` [PATCH iwl-next v2 1/9] net: ethtool: mm: extract stmmac verification logic into common library Faizal Rahim
  2025-02-05 10:05 ` [PATCH iwl-next v2 2/9] igc: Rename xdp_get_tx_ring() for non-xdp usage Faizal Rahim
@ 2025-02-05 10:05 ` Faizal Rahim
  2025-02-05 10:05 ` [PATCH iwl-next v2 4/9] igc: Set the RX packet buffer size for TSN mode Faizal Rahim
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Faizal Rahim @ 2025-02-05 10:05 UTC (permalink / raw)
  To: Tony Nguyen, Przemek Kitszel, Andrew Lunn, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Maxime Coquelin,
	Alexandre Torgue, Simon Horman, Russell King, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Russell King, Furong Xu, Vladimir Oltean, Serge Semin,
	Xiaolei Wang, Suraj Jaiswal, Kory Maincent, Petr Tesarik,
	Faizal Rahim, Choong Yong Liang, Kunihiko Hayashi,
	Vinicius Costa Gomes, intel-wired-lan, netdev, linux-kernel,
	linux-stm32, linux-arm-kernel, bpf

Packet buffers (RX + TX) total 64KB. Neither RX or TX buffers can be
larger than 34KB. So divide the buffer equally, 32KB for each.

Co-developed-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com>
---
 drivers/net/ethernet/intel/igc/igc_defines.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/igc/igc_defines.h b/drivers/net/ethernet/intel/igc/igc_defines.h
index 8e449904aa7d..516ef70c98e9 100644
--- a/drivers/net/ethernet/intel/igc/igc_defines.h
+++ b/drivers/net/ethernet/intel/igc/igc_defines.h
@@ -400,7 +400,8 @@
 #define I225_TXPBSIZE_DEFAULT	0x04000014 /* TXPBSIZE default */
 #define IGC_RXPBS_CFG_TS_EN	0x80000000 /* Timestamp in Rx buffer */
 
-#define IGC_TXPBSIZE_TSN	0x04145145 /* 5k bytes buffer for each queue */
+ /* 7KB bytes buffer for each tx queue (total 4 queues) + 4KB for BMC*/
+#define IGC_TXPBSIZE_TSN	0x041c71c7
 
 #define IGC_DTXMXPKTSZ_TSN	0x19 /* 1600 bytes of max TX DMA packet size */
 #define IGC_DTXMXPKTSZ_DEFAULT	0x98 /* 9728-byte Jumbo frames */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH iwl-next v2 4/9] igc: Set the RX packet buffer size for TSN mode
  2025-02-05 10:05 [PATCH iwl-next v2 0/9] igc: Add support for Frame Preemption feature in IGC Faizal Rahim
                   ` (2 preceding siblings ...)
  2025-02-05 10:05 ` [PATCH iwl-next v2 3/9] igc: Optimize the TX packet buffer utilization Faizal Rahim
@ 2025-02-05 10:05 ` Faizal Rahim
  2025-02-05 10:05 ` [PATCH iwl-next v2 5/9] igc: Add support for frame preemption verification Faizal Rahim
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Faizal Rahim @ 2025-02-05 10:05 UTC (permalink / raw)
  To: Tony Nguyen, Przemek Kitszel, Andrew Lunn, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Maxime Coquelin,
	Alexandre Torgue, Simon Horman, Russell King, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Russell King, Furong Xu, Vladimir Oltean, Serge Semin,
	Xiaolei Wang, Suraj Jaiswal, Kory Maincent, Petr Tesarik,
	Faizal Rahim, Choong Yong Liang, Kunihiko Hayashi,
	Vinicius Costa Gomes, intel-wired-lan, netdev, linux-kernel,
	linux-stm32, linux-arm-kernel, bpf

In preparation for supporting frame preemption, when entering TSN mode
set the receive packet buffer to 16KB for the Express MAC, 16KB for
the Preemptible MAC and 2KB for the BMC, according to the datasheet
section 7.1.3.2.

Co-developed-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com>
---
 drivers/net/ethernet/intel/igc/igc_defines.h |  3 +++
 drivers/net/ethernet/intel/igc/igc_tsn.c     | 13 +++++++++++--
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc_defines.h b/drivers/net/ethernet/intel/igc/igc_defines.h
index 516ef70c98e9..b19ac6f30dac 100644
--- a/drivers/net/ethernet/intel/igc/igc_defines.h
+++ b/drivers/net/ethernet/intel/igc/igc_defines.h
@@ -402,6 +402,9 @@
 
  /* 7KB bytes buffer for each tx queue (total 4 queues) + 4KB for BMC*/
 #define IGC_TXPBSIZE_TSN	0x041c71c7
+/* 15KB for EXP + 15KB for BE + 2KB for BMC */
+#define IGC_RXPBSIZE_TSN	0x0000f08f
+#define IGC_RXPBSIZE_SIZE_MASK	0x0001FFFF
 
 #define IGC_DTXMXPKTSZ_TSN	0x19 /* 1600 bytes of max TX DMA packet size */
 #define IGC_DTXMXPKTSZ_DEFAULT	0x98 /* 9728-byte Jumbo frames */
diff --git a/drivers/net/ethernet/intel/igc/igc_tsn.c b/drivers/net/ethernet/intel/igc/igc_tsn.c
index 1e44374ca1ff..f0213cfce07d 100644
--- a/drivers/net/ethernet/intel/igc/igc_tsn.c
+++ b/drivers/net/ethernet/intel/igc/igc_tsn.c
@@ -132,13 +132,17 @@ static int igc_tsn_disable_offload(struct igc_adapter *adapter)
 {
 	u16 queue_per_tc[4] = { 3, 2, 1, 0 };
 	struct igc_hw *hw = &adapter->hw;
-	u32 tqavctrl;
+	u32 tqavctrl, rxpbs;
 	int i;
 
 	wr32(IGC_GTXOFFSET, 0);
 	wr32(IGC_TXPBS, I225_TXPBSIZE_DEFAULT);
 	wr32(IGC_DTXMXPKTSZ, IGC_DTXMXPKTSZ_DEFAULT);
 
+	rxpbs = rd32(IGC_RXPBS) & ~IGC_RXPBSIZE_SIZE_MASK;
+	rxpbs |= I225_RXPBSIZE_DEFAULT;
+	wr32(IGC_RXPBS, rxpbs);
+
 	if (igc_is_device_id_i226(hw))
 		igc_tsn_restore_retx_default(adapter);
 
@@ -194,7 +198,7 @@ static int igc_tsn_enable_offload(struct igc_adapter *adapter)
 {
 	struct igc_hw *hw = &adapter->hw;
 	u32 tqavctrl, baset_l, baset_h;
-	u32 sec, nsec, cycle;
+	u32 sec, nsec, cycle, rxpbs;
 	ktime_t base_time, systim;
 	int i;
 
@@ -202,6 +206,11 @@ static int igc_tsn_enable_offload(struct igc_adapter *adapter)
 	wr32(IGC_DTXMXPKTSZ, IGC_DTXMXPKTSZ_TSN);
 	wr32(IGC_TXPBS, IGC_TXPBSIZE_TSN);
 
+	rxpbs = rd32(IGC_RXPBS) & ~IGC_RXPBSIZE_SIZE_MASK;
+	rxpbs |= IGC_RXPBSIZE_TSN;
+
+	wr32(IGC_RXPBS, rxpbs);
+
 	if (igc_is_device_id_i226(hw))
 		igc_tsn_set_retx_qbvfullthreshold(adapter);
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH iwl-next v2 5/9] igc: Add support for frame preemption verification
  2025-02-05 10:05 [PATCH iwl-next v2 0/9] igc: Add support for Frame Preemption feature in IGC Faizal Rahim
                   ` (3 preceding siblings ...)
  2025-02-05 10:05 ` [PATCH iwl-next v2 4/9] igc: Set the RX packet buffer size for TSN mode Faizal Rahim
@ 2025-02-05 10:05 ` Faizal Rahim
  2025-02-05 17:12   ` Vladimir Oltean
  2025-02-05 10:05 ` [PATCH iwl-next v2 6/9] igc: Add support to set tx-min-frag-size Faizal Rahim
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 14+ messages in thread
From: Faizal Rahim @ 2025-02-05 10:05 UTC (permalink / raw)
  To: Tony Nguyen, Przemek Kitszel, Andrew Lunn, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Maxime Coquelin,
	Alexandre Torgue, Simon Horman, Russell King, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Russell King, Furong Xu, Vladimir Oltean, Serge Semin,
	Xiaolei Wang, Suraj Jaiswal, Kory Maincent, Petr Tesarik,
	Faizal Rahim, Choong Yong Liang, Kunihiko Hayashi,
	Vinicius Costa Gomes, intel-wired-lan, netdev, linux-kernel,
	linux-stm32, linux-arm-kernel, bpf

This patch implements the "ethtool --set-mm" callback to trigger the
frame preemption verification handshake.

Uses the MAC Merge Software Verification (mmsv) mechanism in ethtool
to perform the verification handshake for igc.
The structure fpe.mmsv is set by mmsv in ethtool and should remain
read-only for the driver.

igc does not use two mmsv callbacks:
a) configure_tx()
   - igc lacks registers to configure FPE in the transmit direction.

b) configure_pmac()
   - this callback dynamically controls pmac_enabled at runtime. For
     example, mmsv calls configure_pmac() and disables pmac_enabled when
     the link partner goes down, even if the user previously enabled it.
     The intention is to save power but it is not feasible in igc
     because it causes an endless adapter reset loop:

	 1) Board A and Board B complete the verification handshake.
            Tx mode register for both boards are in TSN mode.
	 2) Board B link goes down.

         On Board A:
	 3) mmsv calls configure_pmac() with pmac_enabled = false.
	 4) configure_pmac() in igc updates a new field based on
            pmac_enabled. Driver uses this field in igc_tsn_new_flags()
            to indicate that the user enabled/disabled FPE.
	 5) configure_pmac() in igc calls igc_tsn_offload_apply() to check
            whether an adapter reset is needed. Calls existing logic in
            igc_tsn_will_tx_mode_change() and igc_tsn_new_flags().
	 6) Since pmac_enabled is now disabled and no other TSN feature
            is active, igc_tsn_will_tx_mode_change() evaluates to true
            because Tx mode will switch from TSN to Legacy.
	 7) Driver resets the adapter.
	 8) Registers are set, and Tx mode switches to Legacy.
	 9) When link partner is up, steps 3–8 repeat, but this time
            with pmac_enabled = true, reactivating TSN.
            igc_tsn_will_tx_mode_change() evaluates to true again,
            since Tx mode will switch from Legacy to TSN.
	10) Driver resets the adapter.
	11) Rest adapter completes, registers are set, and Tx mode
	    switches to TSN.

        On Board B:
	12) Adapter reset on Board A at step 10 causes it to detect its
            link partner as down.
	13) Repeats steps 3–8.
	14) Once reset adapter on Board A is completed at step 11, it
            detects its link partner as up.
	15) Repeats steps 9–11.

   - this cycle repeats indefinitely. To avoid this issue, IGC only uses
     mmsv.pmac_enabled to track whether FPE is enabled or disabled.

Co-developed-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Co-developed-by: Choong Yong Liang <yong.liang.choong@linux.intel.com>
Signed-off-by: Choong Yong Liang <yong.liang.choong@linux.intel.com>
Signed-off-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com>
---
 drivers/net/ethernet/intel/igc/igc.h         |  12 +-
 drivers/net/ethernet/intel/igc/igc_defines.h |  10 +-
 drivers/net/ethernet/intel/igc/igc_ethtool.c |  19 +++
 drivers/net/ethernet/intel/igc/igc_main.c    |  50 +++++-
 drivers/net/ethernet/intel/igc/igc_tsn.c     | 160 ++++++++++++++++++-
 drivers/net/ethernet/intel/igc/igc_tsn.h     |  33 ++++
 6 files changed, 279 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h
index 22ecdac26cf4..705bd4739e3b 100644
--- a/drivers/net/ethernet/intel/igc/igc.h
+++ b/drivers/net/ethernet/intel/igc/igc.h
@@ -40,6 +40,10 @@ void igc_ethtool_set_ops(struct net_device *);
 
 #define IGC_MAX_TX_TSTAMP_REGS		4
 
+struct fpe_t {
+	struct ethtool_mmsv mmsv;
+};
+
 enum igc_mac_filter_type {
 	IGC_MAC_FILTER_TYPE_DST = 0,
 	IGC_MAC_FILTER_TYPE_SRC
@@ -332,6 +336,8 @@ struct igc_adapter {
 		struct timespec64 period;
 	} perout[IGC_N_PEROUT];
 
+	struct fpe_t fpe;
+
 	/* LEDs */
 	struct mutex led_mutex;
 	struct igc_led_classdev *leds;
@@ -389,10 +395,11 @@ extern char igc_driver_name[];
 #define IGC_FLAG_TSN_QBV_ENABLED	BIT(17)
 #define IGC_FLAG_TSN_QAV_ENABLED	BIT(18)
 #define IGC_FLAG_TSN_LEGACY_ENABLED	BIT(19)
+#define IGC_FLAG_TSN_PREEMPT_ENABLED	BIT(20)
 
 #define IGC_FLAG_TSN_ANY_ENABLED				\
 	(IGC_FLAG_TSN_QBV_ENABLED | IGC_FLAG_TSN_QAV_ENABLED |	\
-	 IGC_FLAG_TSN_LEGACY_ENABLED)
+	 IGC_FLAG_TSN_LEGACY_ENABLED | IGC_FLAG_TSN_PREEMPT_ENABLED)
 
 #define IGC_FLAG_RSS_FIELD_IPV4_UDP	BIT(6)
 #define IGC_FLAG_RSS_FIELD_IPV6_UDP	BIT(7)
@@ -736,7 +743,10 @@ struct igc_nfc_rule *igc_get_nfc_rule(struct igc_adapter *adapter,
 				      u32 location);
 int igc_add_nfc_rule(struct igc_adapter *adapter, struct igc_nfc_rule *rule);
 void igc_del_nfc_rule(struct igc_adapter *adapter, struct igc_nfc_rule *rule);
+void igc_disable_empty_addr_recv(struct igc_adapter *adapter);
+int igc_enable_empty_addr_recv(struct igc_adapter *adapter);
 struct igc_ring *igc_get_tx_ring(struct igc_adapter *adapter, int cpu);
+void igc_flush_tx_descriptors(struct igc_ring *ring);
 void igc_ptp_init(struct igc_adapter *adapter);
 void igc_ptp_reset(struct igc_adapter *adapter);
 void igc_ptp_suspend(struct igc_adapter *adapter);
diff --git a/drivers/net/ethernet/intel/igc/igc_defines.h b/drivers/net/ethernet/intel/igc/igc_defines.h
index b19ac6f30dac..563b154c3e94 100644
--- a/drivers/net/ethernet/intel/igc/igc_defines.h
+++ b/drivers/net/ethernet/intel/igc/igc_defines.h
@@ -308,6 +308,10 @@
 #define IGC_TXD_DTYP_C		0x00000000 /* Context Descriptor */
 #define IGC_TXD_POPTS_IXSM	0x01       /* Insert IP checksum */
 #define IGC_TXD_POPTS_TXSM	0x02       /* Insert TCP/UDP checksum */
+#define IGC_TXD_POPTS_SMD_V	0x01       /* Transmitted packet is a SMD-Verify */
+#define IGC_TXD_POPTS_SMD_R	0x02       /* Transmitted packet is a SMD-Response */
+#define IGC_TXD_POPTS_SMD_SHIFT	12
+#define IGC_TXD_POPTS_SMD_MASK	0x3000
 #define IGC_TXD_CMD_EOP		0x01000000 /* End of Packet */
 #define IGC_TXD_CMD_IC		0x04000000 /* Insert Checksum */
 #define IGC_TXD_CMD_DEXT	0x20000000 /* Desc extension (0 = legacy) */
@@ -363,6 +367,8 @@
 #define IGC_SRRCTL_TIMER0SEL(timer)	(((timer) & 0x3) << 17)
 
 /* Receive Descriptor bit definitions */
+#define IGC_RXD_STAT_SMD_TYPE_V	0x01	/* SMD-V Packet */
+#define IGC_RXD_STAT_SMD_TYPE_R	0x02	/* SMD-R Packet */
 #define IGC_RXD_STAT_EOP	0x02	/* End of Packet */
 #define IGC_RXD_STAT_IXSM	0x04	/* Ignore checksum */
 #define IGC_RXD_STAT_UDPCS	0x10	/* UDP xsum calculated */
@@ -372,7 +378,8 @@
 #define IGC_RXDEXT_STATERR_LB	0x00040000
 
 /* Advanced Receive Descriptor bit definitions */
-#define IGC_RXDADV_STAT_TSIP	0x08000 /* timestamp in packet */
+#define IGC_RXDADV_STAT_SMD_TYPE_MASK	0x06000
+#define IGC_RXDADV_STAT_TSIP		0x08000 /* timestamp in packet */
 
 #define IGC_RXDEXT_STATERR_L4E		0x20000000
 #define IGC_RXDEXT_STATERR_IPE		0x40000000
@@ -543,6 +550,7 @@
 
 /* Transmit Scheduling */
 #define IGC_TQAVCTRL_TRANSMIT_MODE_TSN	0x00000001
+#define IGC_TQAVCTRL_PREEMPT_ENA	0x00000002
 #define IGC_TQAVCTRL_ENHANCED_QAV	0x00000008
 #define IGC_TQAVCTRL_FUTSCDDIS		0x00000080
 
diff --git a/drivers/net/ethernet/intel/igc/igc_ethtool.c b/drivers/net/ethernet/intel/igc/igc_ethtool.c
index 817838677817..41cb03816f92 100644
--- a/drivers/net/ethernet/intel/igc/igc_ethtool.c
+++ b/drivers/net/ethernet/intel/igc/igc_ethtool.c
@@ -8,6 +8,7 @@
 
 #include "igc.h"
 #include "igc_diag.h"
+#include "igc_tsn.h"
 
 /* forward declaration */
 struct igc_stats {
@@ -1781,6 +1782,23 @@ static int igc_ethtool_set_eee(struct net_device *netdev,
 	return 0;
 }
 
+static int igc_ethtool_set_mm(struct net_device *netdev,
+			      struct ethtool_mm_cfg *cmd,
+			      struct netlink_ext_ack *extack)
+{
+	struct igc_adapter *adapter = netdev_priv(netdev);
+	struct fpe_t *fpe = &adapter->fpe;
+
+	ethtool_mmsv_set_mm(&fpe->mmsv, cmd);
+
+	if (fpe->mmsv.pmac_enabled)
+		static_branch_enable(&igc_fpe_enabled);
+	else
+		static_branch_disable(&igc_fpe_enabled);
+
+	return igc_tsn_offload_apply(adapter);
+}
+
 static int igc_ethtool_get_link_ksettings(struct net_device *netdev,
 					  struct ethtool_link_ksettings *cmd)
 {
@@ -2068,6 +2086,7 @@ static const struct ethtool_ops igc_ethtool_ops = {
 	.set_rxfh		= igc_ethtool_set_rxfh,
 	.get_ts_info		= igc_ethtool_get_ts_info,
 	.get_channels		= igc_ethtool_get_channels,
+	.set_mm			= igc_ethtool_set_mm,
 	.set_channels		= igc_ethtool_set_channels,
 	.get_priv_flags		= igc_ethtool_get_priv_flags,
 	.set_priv_flags		= igc_ethtool_set_priv_flags,
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 44e4f925491f..396410522e01 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -2529,7 +2529,7 @@ static int igc_xdp_run_prog(struct igc_adapter *adapter, struct xdp_buff *xdp)
 }
 
 /* This function assumes __netif_tx_lock is held by the caller. */
-static void igc_flush_tx_descriptors(struct igc_ring *ring)
+void igc_flush_tx_descriptors(struct igc_ring *ring)
 {
 	/* Once tail pointer is updated, hardware can fetch the descriptors
 	 * any time so we issue a write membar here to ensure all memory
@@ -2618,6 +2618,15 @@ static int igc_clean_rx_irq(struct igc_q_vector *q_vector, const int budget)
 			size -= IGC_TS_HDR_LEN;
 		}
 
+		if (static_branch_unlikely(&igc_fpe_enabled) &&
+		    igc_fpe_is_verify_or_response(rx_desc, size)) {
+			igc_fpe_lp_event_status(rx_desc, &adapter->fpe.mmsv);
+			/* Advance the ring next-to-clean */
+			igc_is_non_eop(rx_ring, rx_desc);
+			cleaned_count++;
+			continue;
+		}
+
 		if (!skb) {
 			xdp_init_buff(&ctx.xdp, truesize, &rx_ring->xdp_rxq);
 			xdp_prepare_buff(&ctx.xdp, pktbuf - igc_rx_offset(rx_ring),
@@ -3065,6 +3074,11 @@ static bool igc_clean_tx_irq(struct igc_q_vector *q_vector, int napi_budget)
 		if (!(eop_desc->wb.status & cpu_to_le32(IGC_TXD_STAT_DD)))
 			break;
 
+		if (static_branch_unlikely(&igc_fpe_enabled) &&
+		    igc_fpe_transmitted_smd_v(tx_desc))
+			ethtool_mmsv_event_handle(&adapter->fpe.mmsv,
+						  ETHTOOL_MMSV_LD_SENT_VERIFY_MPACKET);
+
 		/* Hold the completions while there's a pending tx hardware
 		 * timestamp request from XDP Tx metadata.
 		 */
@@ -3956,6 +3970,30 @@ static int igc_uc_unsync(struct net_device *netdev, const unsigned char *addr)
 	return 0;
 }
 
+/**
+ * igc_enable_empty_addr_recv - Enable rx of packets with all-zeroes MAC address
+ * @adapter: Pointer to the igc_adapter structure.
+ *
+ * Frame preemption verification requires that packets with the all-zeroes
+ * MAC address are allowed to be received by IGC. This function adds the
+ * all-zeroes destination address to the list of acceptable addresses.
+ *
+ * Return: 0 on success, negative value otherwise.
+ */
+int igc_enable_empty_addr_recv(struct igc_adapter *adapter)
+{
+	u8 empty[ETH_ALEN] = { };
+
+	return igc_add_mac_filter(adapter, IGC_MAC_FILTER_TYPE_DST, empty, -1);
+}
+
+void igc_disable_empty_addr_recv(struct igc_adapter *adapter)
+{
+	u8 empty[ETH_ALEN] = { };
+
+	igc_del_mac_filter(adapter, IGC_MAC_FILTER_TYPE_DST, empty);
+}
+
 /**
  * igc_set_rx_mode - Secondary Unicast, Multicast and Promiscuous mode set
  * @netdev: network interface device structure
@@ -5755,6 +5793,10 @@ static void igc_watchdog_task(struct work_struct *work)
 			 */
 			igc_tsn_adjust_txtime_offset(adapter);
 
+			if (adapter->fpe.mmsv.pmac_enabled)
+				ethtool_mmsv_link_state_handle(&adapter->fpe.mmsv,
+							       true);
+
 			if (adapter->link_speed != SPEED_1000)
 				goto no_wait;
 
@@ -5790,6 +5832,10 @@ static void igc_watchdog_task(struct work_struct *work)
 			netdev_info(netdev, "NIC Link is Down\n");
 			netif_carrier_off(netdev);
 
+			if (adapter->fpe.mmsv.pmac_enabled)
+				ethtool_mmsv_link_state_handle(&adapter->fpe.mmsv,
+							       false);
+
 			/* link state has changed, schedule phy info update */
 			if (!test_bit(__IGC_DOWN, &adapter->state))
 				mod_timer(&adapter->phy_info_timer,
@@ -7110,6 +7156,8 @@ static int igc_probe(struct pci_dev *pdev,
 
 	igc_tsn_clear_schedule(adapter);
 
+	igc_fpe_init(adapter);
+
 	/* reset the hardware with the new settings */
 	igc_reset(adapter);
 
diff --git a/drivers/net/ethernet/intel/igc/igc_tsn.c b/drivers/net/ethernet/intel/igc/igc_tsn.c
index f0213cfce07d..52ff6f324d79 100644
--- a/drivers/net/ethernet/intel/igc/igc_tsn.c
+++ b/drivers/net/ethernet/intel/igc/igc_tsn.c
@@ -1,10 +1,153 @@
 // SPDX-License-Identifier: GPL-2.0
 /* Copyright (c)  2019 Intel Corporation */
 
+#include <linux/kernel.h>
 #include "igc.h"
+#include "igc_base.h"
 #include "igc_hw.h"
 #include "igc_tsn.h"
 
+DEFINE_STATIC_KEY_FALSE(igc_fpe_enabled);
+
+static int igc_fpe_init_smd_frame(struct igc_ring *ring,
+				  struct igc_tx_buffer *buffer,
+				  struct sk_buff *skb)
+{
+	dma_addr_t dma = dma_map_single(ring->dev, skb->data, skb->len,
+					DMA_TO_DEVICE);
+
+	if (dma_mapping_error(ring->dev, dma)) {
+		netdev_err_once(ring->netdev, "Failed to map DMA for TX\n");
+		return -ENOMEM;
+	}
+
+	buffer->skb = skb;
+	buffer->protocol = 0;
+	buffer->bytecount = skb->len;
+	buffer->gso_segs = 1;
+	buffer->time_stamp = jiffies;
+	dma_unmap_len_set(buffer, len, skb->len);
+	dma_unmap_addr_set(buffer, dma, dma);
+
+	return 0;
+}
+
+static int igc_fpe_init_tx_descriptor(struct igc_ring *ring,
+				      struct sk_buff *skb, int type)
+{
+	struct igc_tx_buffer *buffer;
+	union igc_adv_tx_desc *desc;
+	u32 cmd_type, olinfo_status;
+	int err;
+
+	if (!igc_desc_unused(ring))
+		return -EBUSY;
+
+	if (type != IGC_TXD_POPTS_SMD_V && type != IGC_TXD_POPTS_SMD_R)
+		return -EINVAL;
+
+	buffer = &ring->tx_buffer_info[ring->next_to_use];
+	err = igc_fpe_init_smd_frame(ring, buffer, skb);
+	if (err)
+		return err;
+
+	cmd_type = IGC_ADVTXD_DTYP_DATA | IGC_ADVTXD_DCMD_DEXT |
+		   IGC_ADVTXD_DCMD_IFCS | IGC_TXD_DCMD |
+		   buffer->bytecount;
+	olinfo_status = buffer->bytecount << IGC_ADVTXD_PAYLEN_SHIFT;
+
+	switch (type) {
+	case IGC_TXD_POPTS_SMD_V:
+		olinfo_status |= (IGC_TXD_POPTS_SMD_V << IGC_TXD_POPTS_SMD_SHIFT);
+		break;
+	case IGC_TXD_POPTS_SMD_R:
+		olinfo_status |= (IGC_TXD_POPTS_SMD_R << IGC_TXD_POPTS_SMD_SHIFT);
+		break;
+	}
+
+	desc = IGC_TX_DESC(ring, ring->next_to_use);
+	desc->read.cmd_type_len = cpu_to_le32(cmd_type);
+	desc->read.olinfo_status = cpu_to_le32(olinfo_status);
+	desc->read.buffer_addr = cpu_to_le64(dma_unmap_addr(buffer, dma));
+
+	netdev_tx_sent_queue(txring_txq(ring), skb->len);
+
+	buffer->next_to_watch = desc;
+	ring->next_to_use = (ring->next_to_use + 1) % ring->count;
+
+	return 0;
+}
+
+bool igc_fpe_transmitted_smd_v(union igc_adv_tx_desc *tx_desc)
+{
+	u8 smd = FIELD_GET(IGC_TXD_POPTS_SMD_MASK, tx_desc->read.olinfo_status);
+
+	return smd == IGC_TXD_POPTS_SMD_V;
+}
+
+static int igc_fpe_xmit_smd_frame(struct igc_adapter *adapter, int type)
+{
+	int cpu = smp_processor_id();
+	struct netdev_queue *nq;
+	struct igc_ring *ring;
+	struct sk_buff *skb;
+	void *data;
+	int err;
+
+	if (!netif_running(adapter->netdev))
+		return -ENOTCONN;
+
+	ring = igc_get_tx_ring(adapter, cpu);
+	nq = txring_txq(ring);
+
+	skb = alloc_skb(SMD_FRAME_SIZE, GFP_KERNEL);
+	if (!skb)
+		return -ENOMEM;
+
+	data = skb_put(skb, SMD_FRAME_SIZE);
+	memset(data, 0, SMD_FRAME_SIZE);
+
+	__netif_tx_lock(nq, cpu);
+
+	err = igc_fpe_init_tx_descriptor(ring, skb, type);
+	igc_flush_tx_descriptors(ring);
+
+	__netif_tx_unlock(nq);
+
+	return err;
+}
+
+static void igc_fpe_send_mpacket(struct ethtool_mmsv *mmsv,
+				 enum ethtool_mpacket type)
+{
+	struct fpe_t *fpe = container_of(mmsv, struct fpe_t, mmsv);
+	struct igc_adapter *adapter;
+	int err;
+
+	adapter = container_of(fpe, struct igc_adapter, fpe);
+
+	if (type == ETHTOOL_MPACKET_VERIFY) {
+		err = igc_fpe_xmit_smd_frame(adapter, IGC_TXD_POPTS_SMD_V);
+		if (err)
+			netdev_err(adapter->netdev, "Error sending SMD-V\n");
+	} else if (type == ETHTOOL_MPACKET_RESPONSE) {
+		err = igc_fpe_xmit_smd_frame(adapter, IGC_TXD_POPTS_SMD_R);
+		if (err)
+			netdev_err(adapter->netdev, "Error sending SMD-R frame\n");
+	}
+}
+
+static const struct ethtool_mmsv_ops igc_mmsv_ops = {
+	.send_mpacket = igc_fpe_send_mpacket,
+};
+
+void igc_fpe_init(struct igc_adapter *adapter)
+{
+	struct fpe_t *fpe = &adapter->fpe;
+
+	ethtool_mmsv_init(&adapter->fpe.mmsv, adapter->netdev, &igc_mmsv_ops);
+}
+
 static bool is_any_launchtime(struct igc_adapter *adapter)
 {
 	int i;
@@ -49,6 +192,9 @@ static unsigned int igc_tsn_new_flags(struct igc_adapter *adapter)
 	if (adapter->strict_priority_enable)
 		new_flags |= IGC_FLAG_TSN_LEGACY_ENABLED;
 
+	if (adapter->fpe.mmsv.pmac_enabled)
+		new_flags |= IGC_FLAG_TSN_PREEMPT_ENABLED;
+
 	return new_flags;
 }
 
@@ -148,7 +294,8 @@ static int igc_tsn_disable_offload(struct igc_adapter *adapter)
 
 	tqavctrl = rd32(IGC_TQAVCTRL);
 	tqavctrl &= ~(IGC_TQAVCTRL_TRANSMIT_MODE_TSN |
-		      IGC_TQAVCTRL_ENHANCED_QAV | IGC_TQAVCTRL_FUTSCDDIS);
+		      IGC_TQAVCTRL_ENHANCED_QAV | IGC_TQAVCTRL_FUTSCDDIS |
+		      IGC_TQAVCTRL_PREEMPT_ENA);
 
 	wr32(IGC_TQAVCTRL, tqavctrl);
 
@@ -370,10 +517,14 @@ static int igc_tsn_enable_offload(struct igc_adapter *adapter)
 		wr32(IGC_TXQCTL(i), txqctl);
 	}
 
-	tqavctrl = rd32(IGC_TQAVCTRL) & ~IGC_TQAVCTRL_FUTSCDDIS;
+	tqavctrl = rd32(IGC_TQAVCTRL) & ~(IGC_TQAVCTRL_FUTSCDDIS |
+		   IGC_TQAVCTRL_PREEMPT_ENA);
 
 	tqavctrl |= IGC_TQAVCTRL_TRANSMIT_MODE_TSN | IGC_TQAVCTRL_ENHANCED_QAV;
 
+	if (adapter->fpe.mmsv.pmac_enabled)
+		tqavctrl |= IGC_TQAVCTRL_PREEMPT_ENA;
+
 	adapter->qbv_count++;
 
 	cycle = adapter->cycle_time;
@@ -434,6 +585,11 @@ int igc_tsn_reset(struct igc_adapter *adapter)
 	unsigned int new_flags;
 	int err = 0;
 
+	if (adapter->fpe.mmsv.pmac_enabled)
+		igc_enable_empty_addr_recv(adapter);
+	else
+		igc_disable_empty_addr_recv(adapter);
+
 	new_flags = igc_tsn_new_flags(adapter);
 
 	if (!(new_flags & IGC_FLAG_TSN_ANY_ENABLED))
diff --git a/drivers/net/ethernet/intel/igc/igc_tsn.h b/drivers/net/ethernet/intel/igc/igc_tsn.h
index 98ec845a86bf..889d70c6d827 100644
--- a/drivers/net/ethernet/intel/igc/igc_tsn.h
+++ b/drivers/net/ethernet/intel/igc/igc_tsn.h
@@ -4,9 +4,42 @@
 #ifndef _IGC_TSN_H_
 #define _IGC_TSN_H_
 
+#define SMD_FRAME_SIZE			60
+
+DECLARE_STATIC_KEY_FALSE(igc_fpe_enabled);
+
+void igc_fpe_init(struct igc_adapter *adapter);
+u32 igc_fpe_get_supported_frag_size(u32 user_frag_size);
+bool igc_fpe_transmitted_smd_v(union igc_adv_tx_desc *tx_desc);
 int igc_tsn_offload_apply(struct igc_adapter *adapter);
 int igc_tsn_reset(struct igc_adapter *adapter);
 void igc_tsn_adjust_txtime_offset(struct igc_adapter *adapter);
 bool igc_tsn_is_taprio_activated_by_user(struct igc_adapter *adapter);
 
+static inline void igc_fpe_lp_event_status(union igc_adv_rx_desc *rx_desc,
+					   struct ethtool_mmsv *mmsv)
+{
+	__le32 status_error = le32_to_cpu(rx_desc->wb.upper.status_error);
+	int smd;
+
+	smd = FIELD_GET(IGC_RXDADV_STAT_SMD_TYPE_MASK, status_error);
+
+	if (smd == IGC_RXD_STAT_SMD_TYPE_V)
+		ethtool_mmsv_event_handle(mmsv, ETHTOOL_MMSV_LP_SENT_VERIFY_MPACKET);
+	else if (smd == IGC_RXD_STAT_SMD_TYPE_R)
+		ethtool_mmsv_event_handle(mmsv, ETHTOOL_MMSV_LP_SENT_RESPONSE_MPACKET);
+}
+
+static inline bool igc_fpe_is_verify_or_response(union igc_adv_rx_desc *rx_desc,
+						 unsigned int size)
+{
+	__le32 status_error = le32_to_cpu(rx_desc->wb.upper.status_error);
+	int smd;
+
+	smd = FIELD_GET(IGC_RXDADV_STAT_SMD_TYPE_MASK, status_error);
+
+	return ((smd == IGC_RXD_STAT_SMD_TYPE_V || smd == IGC_RXD_STAT_SMD_TYPE_R) &&
+		size == SMD_FRAME_SIZE);
+}
+
 #endif /* _IGC_BASE_H */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH iwl-next v2 6/9] igc: Add support to set tx-min-frag-size
  2025-02-05 10:05 [PATCH iwl-next v2 0/9] igc: Add support for Frame Preemption feature in IGC Faizal Rahim
                   ` (4 preceding siblings ...)
  2025-02-05 10:05 ` [PATCH iwl-next v2 5/9] igc: Add support for frame preemption verification Faizal Rahim
@ 2025-02-05 10:05 ` Faizal Rahim
  2025-02-05 10:05 ` [PATCH iwl-next v2 7/9] igc: Add support for preemptible traffic class in taprio Faizal Rahim
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Faizal Rahim @ 2025-02-05 10:05 UTC (permalink / raw)
  To: Tony Nguyen, Przemek Kitszel, Andrew Lunn, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Maxime Coquelin,
	Alexandre Torgue, Simon Horman, Russell King, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Russell King, Furong Xu, Vladimir Oltean, Serge Semin,
	Xiaolei Wang, Suraj Jaiswal, Kory Maincent, Petr Tesarik,
	Faizal Rahim, Choong Yong Liang, Kunihiko Hayashi,
	Vinicius Costa Gomes, intel-wired-lan, netdev, linux-kernel,
	linux-stm32, linux-arm-kernel, bpf

Add support to set tx-min-frag-size via set_mm callback in igc.
Increase the max limit of tx-ming-frag-size in ethtool from 252 to 256
since i225/6 value range is 64, 128, 192 and 256.

Co-developed-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com>
---
 drivers/net/ethernet/intel/igc/igc.h         |  1 +
 drivers/net/ethernet/intel/igc/igc_defines.h |  2 ++
 drivers/net/ethernet/intel/igc/igc_ethtool.c |  5 +++
 drivers/net/ethernet/intel/igc/igc_tsn.c     | 37 ++++++++++++++++++--
 drivers/net/ethernet/intel/igc/igc_tsn.h     |  2 +-
 net/ethtool/mm.c                             |  2 +-
 6 files changed, 44 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h
index 705bd4739e3b..2f3662143589 100644
--- a/drivers/net/ethernet/intel/igc/igc.h
+++ b/drivers/net/ethernet/intel/igc/igc.h
@@ -42,6 +42,7 @@ void igc_ethtool_set_ops(struct net_device *);
 
 struct fpe_t {
 	struct ethtool_mmsv mmsv;
+	u32 tx_min_frag_size;
 };
 
 enum igc_mac_filter_type {
diff --git a/drivers/net/ethernet/intel/igc/igc_defines.h b/drivers/net/ethernet/intel/igc/igc_defines.h
index 563b154c3e94..9d7d903545ed 100644
--- a/drivers/net/ethernet/intel/igc/igc_defines.h
+++ b/drivers/net/ethernet/intel/igc/igc_defines.h
@@ -553,6 +553,8 @@
 #define IGC_TQAVCTRL_PREEMPT_ENA	0x00000002
 #define IGC_TQAVCTRL_ENHANCED_QAV	0x00000008
 #define IGC_TQAVCTRL_FUTSCDDIS		0x00000080
+#define IGC_TQAVCTRL_MIN_FRAG_MASK	0x0000C000
+#define IGC_TQAVCTRL_MIN_FRAG_SHIFT	14
 
 #define IGC_TXQCTL_QUEUE_MODE_LAUNCHT	0x00000001
 #define IGC_TXQCTL_STRICT_CYCLE		0x00000002
diff --git a/drivers/net/ethernet/intel/igc/igc_ethtool.c b/drivers/net/ethernet/intel/igc/igc_ethtool.c
index 41cb03816f92..83da59aec59e 100644
--- a/drivers/net/ethernet/intel/igc/igc_ethtool.c
+++ b/drivers/net/ethernet/intel/igc/igc_ethtool.c
@@ -1789,6 +1789,11 @@ static int igc_ethtool_set_mm(struct net_device *netdev,
 	struct igc_adapter *adapter = netdev_priv(netdev);
 	struct fpe_t *fpe = &adapter->fpe;
 
+	fpe->tx_min_frag_size = igc_fpe_get_supported_frag_size(cmd->tx_min_frag_size);
+	if (fpe->tx_min_frag_size != cmd->tx_min_frag_size)
+		NL_SET_ERR_MSG_MOD(extack,
+				   "tx-min-frag-size value set is unsupported. Rounded up to supported value (64, 128, 192, 256)");
+
 	ethtool_mmsv_set_mm(&fpe->mmsv, cmd);
 
 	if (fpe->mmsv.pmac_enabled)
diff --git a/drivers/net/ethernet/intel/igc/igc_tsn.c b/drivers/net/ethernet/intel/igc/igc_tsn.c
index 52ff6f324d79..50c343c59e29 100644
--- a/drivers/net/ethernet/intel/igc/igc_tsn.c
+++ b/drivers/net/ethernet/intel/igc/igc_tsn.c
@@ -7,6 +7,12 @@
 #include "igc_hw.h"
 #include "igc_tsn.h"
 
+#define MIN_MULTPLIER_TX_MIN_FRAG	0
+#define MAX_MULTPLIER_TX_MIN_FRAG	3
+/* Frag size is based on the Section 8.12.2 of the SW User Manual */
+#define TX_MIN_FRAG_SIZE		64
+#define TX_MAX_FRAG_SIZE	(TX_MIN_FRAG_SIZE * (MAX_MULTPLIER_TX_MIN_FRAG + 1))
+
 DEFINE_STATIC_KEY_FALSE(igc_fpe_enabled);
 
 static int igc_fpe_init_smd_frame(struct igc_ring *ring,
@@ -145,6 +151,7 @@ void igc_fpe_init(struct igc_adapter *adapter)
 {
 	struct fpe_t *fpe = &adapter->fpe;
 
+	fpe->tx_min_frag_size = TX_MIN_FRAG_SIZE;
 	ethtool_mmsv_init(&adapter->fpe.mmsv, adapter->netdev, &igc_mmsv_ops);
 }
 
@@ -295,7 +302,7 @@ static int igc_tsn_disable_offload(struct igc_adapter *adapter)
 	tqavctrl = rd32(IGC_TQAVCTRL);
 	tqavctrl &= ~(IGC_TQAVCTRL_TRANSMIT_MODE_TSN |
 		      IGC_TQAVCTRL_ENHANCED_QAV | IGC_TQAVCTRL_FUTSCDDIS |
-		      IGC_TQAVCTRL_PREEMPT_ENA);
+		      IGC_TQAVCTRL_PREEMPT_ENA | IGC_TQAVCTRL_MIN_FRAG_MASK);
 
 	wr32(IGC_TQAVCTRL, tqavctrl);
 
@@ -341,12 +348,34 @@ static void igc_tsn_set_retx_qbvfullthreshold(struct igc_adapter *adapter)
 	wr32(IGC_RETX_CTL, retxctl);
 }
 
+static u8 igc_fpe_get_frag_size_mult(const struct fpe_t *fpe)
+{
+	u8 mult = (fpe->tx_min_frag_size / TX_MIN_FRAG_SIZE) - 1;
+
+	return clamp_t(u8, mult, MIN_MULTPLIER_TX_MIN_FRAG,
+		       MAX_MULTPLIER_TX_MIN_FRAG);
+}
+
+u32 igc_fpe_get_supported_frag_size(u32 frag_size)
+{
+	const u32 supported_sizes[] = {64, 128, 192, 256};
+
+	/* Find the smallest supported size that is >= frag_size */
+	for (int i = 0; i < ARRAY_SIZE(supported_sizes); i++) {
+		if (frag_size <= supported_sizes[i])
+			return supported_sizes[i];
+	}
+
+	return TX_MAX_FRAG_SIZE; /* Should not happen, value > 256 is blocked by ethtool */
+}
+
 static int igc_tsn_enable_offload(struct igc_adapter *adapter)
 {
 	struct igc_hw *hw = &adapter->hw;
 	u32 tqavctrl, baset_l, baset_h;
 	u32 sec, nsec, cycle, rxpbs;
 	ktime_t base_time, systim;
+	u32 frag_size_mult;
 	int i;
 
 	wr32(IGC_TSAUXC, 0);
@@ -518,13 +547,15 @@ static int igc_tsn_enable_offload(struct igc_adapter *adapter)
 	}
 
 	tqavctrl = rd32(IGC_TQAVCTRL) & ~(IGC_TQAVCTRL_FUTSCDDIS |
-		   IGC_TQAVCTRL_PREEMPT_ENA);
-
+		   IGC_TQAVCTRL_PREEMPT_ENA | IGC_TQAVCTRL_MIN_FRAG_MASK);
 	tqavctrl |= IGC_TQAVCTRL_TRANSMIT_MODE_TSN | IGC_TQAVCTRL_ENHANCED_QAV;
 
 	if (adapter->fpe.mmsv.pmac_enabled)
 		tqavctrl |= IGC_TQAVCTRL_PREEMPT_ENA;
 
+	frag_size_mult = igc_fpe_get_frag_size_mult(&adapter->fpe);
+	tqavctrl |= frag_size_mult << IGC_TQAVCTRL_MIN_FRAG_SHIFT;
+
 	adapter->qbv_count++;
 
 	cycle = adapter->cycle_time;
diff --git a/drivers/net/ethernet/intel/igc/igc_tsn.h b/drivers/net/ethernet/intel/igc/igc_tsn.h
index 889d70c6d827..898c4630bc70 100644
--- a/drivers/net/ethernet/intel/igc/igc_tsn.h
+++ b/drivers/net/ethernet/intel/igc/igc_tsn.h
@@ -9,7 +9,7 @@
 DECLARE_STATIC_KEY_FALSE(igc_fpe_enabled);
 
 void igc_fpe_init(struct igc_adapter *adapter);
-u32 igc_fpe_get_supported_frag_size(u32 user_frag_size);
+u32 igc_fpe_get_supported_frag_size(u32 frag_size);
 bool igc_fpe_transmitted_smd_v(union igc_adv_tx_desc *tx_desc);
 int igc_tsn_offload_apply(struct igc_adapter *adapter);
 int igc_tsn_reset(struct igc_adapter *adapter);
diff --git a/net/ethtool/mm.c b/net/ethtool/mm.c
index 2ebb811948ec..1320d26f4bc9 100644
--- a/net/ethtool/mm.c
+++ b/net/ethtool/mm.c
@@ -152,7 +152,7 @@ const struct nla_policy ethnl_mm_set_policy[ETHTOOL_A_MM_MAX + 1] = {
 	[ETHTOOL_A_MM_VERIFY_TIME]	= NLA_POLICY_RANGE(NLA_U32, 1, 128),
 	[ETHTOOL_A_MM_TX_ENABLED]	= NLA_POLICY_MAX(NLA_U8, 1),
 	[ETHTOOL_A_MM_PMAC_ENABLED]	= NLA_POLICY_MAX(NLA_U8, 1),
-	[ETHTOOL_A_MM_TX_MIN_FRAG_SIZE]	= NLA_POLICY_RANGE(NLA_U32, 60, 252),
+	[ETHTOOL_A_MM_TX_MIN_FRAG_SIZE]	= NLA_POLICY_RANGE(NLA_U32, 60, 256),
 };
 
 static void mm_state_to_cfg(const struct ethtool_mm_state *state,
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH iwl-next v2 7/9] igc: Add support for preemptible traffic class in taprio
  2025-02-05 10:05 [PATCH iwl-next v2 0/9] igc: Add support for Frame Preemption feature in IGC Faizal Rahim
                   ` (5 preceding siblings ...)
  2025-02-05 10:05 ` [PATCH iwl-next v2 6/9] igc: Add support to set tx-min-frag-size Faizal Rahim
@ 2025-02-05 10:05 ` Faizal Rahim
  2025-02-05 10:05 ` [PATCH iwl-next v2 8/9] igc: Add support to get MAC Merge data via ethtool Faizal Rahim
  2025-02-05 10:05 ` [PATCH iwl-next v2 9/9] igc: Add support to get frame preemption statistics " Faizal Rahim
  8 siblings, 0 replies; 14+ messages in thread
From: Faizal Rahim @ 2025-02-05 10:05 UTC (permalink / raw)
  To: Tony Nguyen, Przemek Kitszel, Andrew Lunn, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Maxime Coquelin,
	Alexandre Torgue, Simon Horman, Russell King, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Russell King, Furong Xu, Vladimir Oltean, Serge Semin,
	Xiaolei Wang, Suraj Jaiswal, Kory Maincent, Petr Tesarik,
	Faizal Rahim, Choong Yong Liang, Kunihiko Hayashi,
	Vinicius Costa Gomes, intel-wired-lan, netdev, linux-kernel,
	linux-stm32, linux-arm-kernel, bpf

Set queue as preemptible or express for taprio.
This will eventually set queue-specific preemptible field in TXQCTL
register.

Verified that the correct preemptible hardware queue is set using the
following commands:

a) 1:1 TC-to-Queue Mapping
   $ sudo tc qdisc replace dev enp1s0 parent root handle 100 \
     taprio num_tc 4 map 3 2 1 0 3 3 3 3 3 3 3 3 3 3 3 3 \
     queues 1@0 1@1 1@2 1@3 base-time 0 sched-entry S F 100000 \
     fp E E P P

b) Non-1:1 TC-to-Queue Mapping
   $ sudo tc qdisc replace  dev enp1s0 parent root handle 100 \
     taprio num_tc 3 map 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 2
     queues 2@0 1@2 1@3
     fp E E P

Co-developed-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com>
---
 drivers/net/ethernet/intel/igc/igc.h         |  2 +-
 drivers/net/ethernet/intel/igc/igc_defines.h |  1 +
 drivers/net/ethernet/intel/igc/igc_main.c    | 36 ++++++++++++++++++++
 drivers/net/ethernet/intel/igc/igc_tsn.c     |  4 +++
 4 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h
index 2f3662143589..e9f7a75aef13 100644
--- a/drivers/net/ethernet/intel/igc/igc.h
+++ b/drivers/net/ethernet/intel/igc/igc.h
@@ -163,7 +163,7 @@ struct igc_ring {
 	bool launchtime_enable;         /* true if LaunchTime is enabled */
 	ktime_t last_tx_cycle;          /* end of the cycle with a launchtime transmission */
 	ktime_t last_ff_cycle;          /* Last cycle with an active first flag */
-
+	bool preemptible;               /* True if not express */
 	u32 start_time;
 	u32 end_time;
 	u32 max_sdu;
diff --git a/drivers/net/ethernet/intel/igc/igc_defines.h b/drivers/net/ethernet/intel/igc/igc_defines.h
index 9d7d903545ed..d7b4f5b6e548 100644
--- a/drivers/net/ethernet/intel/igc/igc_defines.h
+++ b/drivers/net/ethernet/intel/igc/igc_defines.h
@@ -559,6 +559,7 @@
 #define IGC_TXQCTL_QUEUE_MODE_LAUNCHT	0x00000001
 #define IGC_TXQCTL_STRICT_CYCLE		0x00000002
 #define IGC_TXQCTL_STRICT_END		0x00000004
+#define IGC_TXQCTL_PREEMPTIBLE		0x00000008
 #define IGC_TXQCTL_QAV_SEL_MASK		0x000000C0
 #define IGC_TXQCTL_QAV_SEL_CBS0		0x00000080
 #define IGC_TXQCTL_QAV_SEL_CBS1		0x000000C0
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 396410522e01..6f7f4c641393 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -6254,6 +6254,39 @@ static bool is_base_time_past(ktime_t base_time, const struct timespec64 *now)
 	return timespec64_compare(now, &b) > 0;
 }
 
+static u32 igc_map_tc_to_queue(const struct igc_adapter *adapter,
+			       unsigned long preemptible_tcs)
+{
+	struct net_device *dev = adapter->netdev;
+	u32 i, queue = 0;
+
+	for (i = 0; i < dev->num_tc; i++) {
+		u32 offset, count;
+
+		if (!(preemptible_tcs & BIT(i)))
+			continue;
+
+		offset = dev->tc_to_txq[i].offset;
+		count = dev->tc_to_txq[i].count;
+		queue |= GENMASK(offset + count - 1, offset);
+	}
+
+	return queue;
+}
+
+static void igc_save_preempt_queue(struct igc_adapter *adapter,
+				   const struct tc_mqprio_qopt_offload *mqprio)
+{
+	u32 preemptible_queue = igc_map_tc_to_queue(adapter,
+						    mqprio->preemptible_tcs);
+
+	for (int i = 0; i < adapter->num_tx_queues; i++) {
+		struct igc_ring *tx_ring = adapter->tx_ring[i];
+
+		tx_ring->preemptible = preemptible_queue & BIT(i);
+	}
+}
+
 static bool validate_schedule(struct igc_adapter *adapter,
 			      const struct tc_taprio_qopt_offload *qopt)
 {
@@ -6340,6 +6373,7 @@ static int igc_qbv_clear_schedule(struct igc_adapter *adapter)
 		ring->start_time = 0;
 		ring->end_time = NSEC_PER_SEC;
 		ring->max_sdu = 0;
+		ring->preemptible = false;
 	}
 
 	spin_lock_irqsave(&adapter->qbv_tx_lock, flags);
@@ -6496,6 +6530,8 @@ static int igc_save_qbv_schedule(struct igc_adapter *adapter,
 			ring->max_sdu = 0;
 	}
 
+	igc_save_preempt_queue(adapter, &qopt->mqprio);
+
 	return 0;
 }
 
diff --git a/drivers/net/ethernet/intel/igc/igc_tsn.c b/drivers/net/ethernet/intel/igc/igc_tsn.c
index 50c343c59e29..61ad464ed053 100644
--- a/drivers/net/ethernet/intel/igc/igc_tsn.c
+++ b/drivers/net/ethernet/intel/igc/igc_tsn.c
@@ -459,6 +459,10 @@ static int igc_tsn_enable_offload(struct igc_adapter *adapter)
 		if (ring->launchtime_enable)
 			txqctl |= IGC_TXQCTL_QUEUE_MODE_LAUNCHT;
 
+		if (ethtool_mmsv_is_tx_active(&adapter->fpe.mmsv) &&
+		    ring->preemptible)
+			txqctl |= IGC_TXQCTL_PREEMPTIBLE;
+
 		/* Skip configuring CBS for Q2 and Q3 */
 		if (i > 1)
 			goto skip_cbs;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH iwl-next v2 8/9] igc: Add support to get MAC Merge data via ethtool
  2025-02-05 10:05 [PATCH iwl-next v2 0/9] igc: Add support for Frame Preemption feature in IGC Faizal Rahim
                   ` (6 preceding siblings ...)
  2025-02-05 10:05 ` [PATCH iwl-next v2 7/9] igc: Add support for preemptible traffic class in taprio Faizal Rahim
@ 2025-02-05 10:05 ` Faizal Rahim
  2025-02-05 10:05 ` [PATCH iwl-next v2 9/9] igc: Add support to get frame preemption statistics " Faizal Rahim
  8 siblings, 0 replies; 14+ messages in thread
From: Faizal Rahim @ 2025-02-05 10:05 UTC (permalink / raw)
  To: Tony Nguyen, Przemek Kitszel, Andrew Lunn, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Maxime Coquelin,
	Alexandre Torgue, Simon Horman, Russell King, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Russell King, Furong Xu, Vladimir Oltean, Serge Semin,
	Xiaolei Wang, Suraj Jaiswal, Kory Maincent, Petr Tesarik,
	Faizal Rahim, Choong Yong Liang, Kunihiko Hayashi,
	Vinicius Costa Gomes, intel-wired-lan, netdev, linux-kernel,
	linux-stm32, linux-arm-kernel, bpf

Implement "ethtool --show-mm" callback for IGC.

Tested with command:
$ ethtool --show-mm enp1s0.
  MAC Merge layer state for enp1s0:
  pMAC enabled: on
  TX enabled: on
  TX active: on
  TX minimum fragment size: 64
  RX minimum fragment size: 60
  Verify enabled: on
  Verify time: 128
  Max verify time: 128
  Verification status: SUCCEEDED

Verified that the fields value are retrieved correctly.

Signed-off-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com>
---
 drivers/net/ethernet/intel/igc/igc_ethtool.c | 14 ++++++++++++++
 drivers/net/ethernet/intel/igc/igc_tsn.h     |  1 +
 2 files changed, 15 insertions(+)

diff --git a/drivers/net/ethernet/intel/igc/igc_ethtool.c b/drivers/net/ethernet/intel/igc/igc_ethtool.c
index 83da59aec59e..0140f2258823 100644
--- a/drivers/net/ethernet/intel/igc/igc_ethtool.c
+++ b/drivers/net/ethernet/intel/igc/igc_ethtool.c
@@ -1782,6 +1782,19 @@ static int igc_ethtool_set_eee(struct net_device *netdev,
 	return 0;
 }
 
+static int igc_ethtool_get_mm(struct net_device *netdev,
+			      struct ethtool_mm_state *cmd)
+{
+	struct igc_adapter *adapter = netdev_priv(netdev);
+	struct fpe_t *fpe = &adapter->fpe;
+
+	ethtool_mmsv_get_mm(&fpe->mmsv, cmd);
+	cmd->tx_min_frag_size = fpe->tx_min_frag_size;
+	cmd->rx_min_frag_size = IGC_RX_MIN_FRAG_SIZE;
+
+	return 0;
+}
+
 static int igc_ethtool_set_mm(struct net_device *netdev,
 			      struct ethtool_mm_cfg *cmd,
 			      struct netlink_ext_ack *extack)
@@ -2091,6 +2104,7 @@ static const struct ethtool_ops igc_ethtool_ops = {
 	.set_rxfh		= igc_ethtool_set_rxfh,
 	.get_ts_info		= igc_ethtool_get_ts_info,
 	.get_channels		= igc_ethtool_get_channels,
+	.get_mm			= igc_ethtool_get_mm,
 	.set_mm			= igc_ethtool_set_mm,
 	.set_channels		= igc_ethtool_set_channels,
 	.get_priv_flags		= igc_ethtool_get_priv_flags,
diff --git a/drivers/net/ethernet/intel/igc/igc_tsn.h b/drivers/net/ethernet/intel/igc/igc_tsn.h
index 898c4630bc70..c82f9718cb85 100644
--- a/drivers/net/ethernet/intel/igc/igc_tsn.h
+++ b/drivers/net/ethernet/intel/igc/igc_tsn.h
@@ -4,6 +4,7 @@
 #ifndef _IGC_TSN_H_
 #define _IGC_TSN_H_
 
+#define IGC_RX_MIN_FRAG_SIZE		60
 #define SMD_FRAME_SIZE			60
 
 DECLARE_STATIC_KEY_FALSE(igc_fpe_enabled);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH iwl-next v2 9/9] igc: Add support to get frame preemption statistics via ethtool
  2025-02-05 10:05 [PATCH iwl-next v2 0/9] igc: Add support for Frame Preemption feature in IGC Faizal Rahim
                   ` (7 preceding siblings ...)
  2025-02-05 10:05 ` [PATCH iwl-next v2 8/9] igc: Add support to get MAC Merge data via ethtool Faizal Rahim
@ 2025-02-05 10:05 ` Faizal Rahim
  8 siblings, 0 replies; 14+ messages in thread
From: Faizal Rahim @ 2025-02-05 10:05 UTC (permalink / raw)
  To: Tony Nguyen, Przemek Kitszel, Andrew Lunn, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Maxime Coquelin,
	Alexandre Torgue, Simon Horman, Russell King, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Russell King, Furong Xu, Vladimir Oltean, Serge Semin,
	Xiaolei Wang, Suraj Jaiswal, Kory Maincent, Petr Tesarik,
	Faizal Rahim, Choong Yong Liang, Kunihiko Hayashi,
	Vinicius Costa Gomes, intel-wired-lan, netdev, linux-kernel,
	linux-stm32, linux-arm-kernel, bpf

Implemented "ethtool --include-statistics --show-mm" callback for IGC.

Tested preemption scenario to check preemption statistics:
1) Trigger verification handshake on both boards:
    $ sudo ethtool --set-mm enp1s0 pmac-enabled on
    $ sudo ethtool --set-mm enp1s0 tx-enabled on
    $ sudo ethtool --set-mm enp1s0 verify-enabled on
2) Set preemptible or express queue in taprio for tx board:
    $ sudo tc qdisc replace dev enp1s0 parent root handle 100 taprio \
      num_tc 4 map 3 2 1 0 3 3 3 3 3 3 3 3 3 3 3 3 \
      queues 1@0 1@1 1@2 1@3 base-time 0 sched-entry S F 100000 \
      fp E E P P
3) Send large size packets on preemptible queue
4) Send small size packets on express queue to preempt packets in
   preemptible queue
5) Show preemption statistics on the receiving board:
   $ ethtool --include-statistics --show-mm enp1s0
     MAC Merge layer state for enp1s0:
     pMAC enabled: on
     TX enabled: on
     TX active: on
     TX minimum fragment size: 64
     RX minimum fragment size: 60
     Verify enabled: on
     Verify time: 128
     Max verify time: 128
     Verification status: SUCCEEDED
     Statistics:
     	MACMergeFrameAssErrorCount: 0
	MACMergeFrameSmdErrorCount: 0
	MACMergeFrameAssOkCount: 511
	MACMergeFragCountRx: 764
	MACMergeFragCountTx: 0
	MACMergeHoldCount: 0

Co-developed-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com>
---
 drivers/net/ethernet/intel/igc/igc_ethtool.c | 36 ++++++++++++++++++++
 drivers/net/ethernet/intel/igc/igc_regs.h    | 16 +++++++++
 2 files changed, 52 insertions(+)

diff --git a/drivers/net/ethernet/intel/igc/igc_ethtool.c b/drivers/net/ethernet/intel/igc/igc_ethtool.c
index 0140f2258823..521f5829cd02 100644
--- a/drivers/net/ethernet/intel/igc/igc_ethtool.c
+++ b/drivers/net/ethernet/intel/igc/igc_ethtool.c
@@ -1817,6 +1817,41 @@ static int igc_ethtool_set_mm(struct net_device *netdev,
 	return igc_tsn_offload_apply(adapter);
 }
 
+/**
+ * igc_ethtool_get_frame_ass_error - Get the frame assembly error count.
+ * @dev: Pointer to the net_device structure.
+ * Return: The count of frame assembly errors.
+ */
+static u64 igc_ethtool_get_frame_ass_error(struct net_device *dev)
+{
+	struct igc_adapter *adapter = netdev_priv(dev);
+	u32 ooo_smdc, ooo_frame_cnt, ooo_frag_cnt; /* Out of order statistics */
+	struct igc_hw *hw = &adapter->hw;
+	u32 miss_frame_frag_cnt;
+	u32 reg_value;
+
+	reg_value = rd32(IGC_PRMEXPRCNT);
+	ooo_smdc = FIELD_GET(IGC_PRMEXPRCNT_OOO_SMDC, reg_value);
+	ooo_frame_cnt = FIELD_GET(IGC_PRMEXPRCNT_OOO_FRAME_CNT, reg_value);
+	ooo_frag_cnt = FIELD_GET(IGC_PRMEXPRCNT_OOO_FRAG_CNT, reg_value);
+	miss_frame_frag_cnt = FIELD_GET(IGC_PRMEXPRCNT_MISS_FRAME_FRAG_CNT,
+					reg_value);
+
+	return ooo_smdc + ooo_frame_cnt + ooo_frag_cnt + miss_frame_frag_cnt;
+}
+
+static void igc_ethtool_get_mm_stats(struct net_device *dev,
+				     struct ethtool_mm_stats *stats)
+{
+	struct igc_adapter *adapter = netdev_priv(dev);
+	struct igc_hw *hw = &adapter->hw;
+
+	stats->MACMergeFrameAssErrorCount = igc_ethtool_get_frame_ass_error(dev);
+	stats->MACMergeFrameAssOkCount = rd32(IGC_PRMPTDRCNT);
+	stats->MACMergeFragCountRx =  rd32(IGC_PRMEVNTRCNT);
+	stats->MACMergeFragCountTx = rd32(IGC_PRMEVNTTCNT);
+}
+
 static int igc_ethtool_get_link_ksettings(struct net_device *netdev,
 					  struct ethtool_link_ksettings *cmd)
 {
@@ -2106,6 +2141,7 @@ static const struct ethtool_ops igc_ethtool_ops = {
 	.get_channels		= igc_ethtool_get_channels,
 	.get_mm			= igc_ethtool_get_mm,
 	.set_mm			= igc_ethtool_set_mm,
+	.get_mm_stats		= igc_ethtool_get_mm_stats,
 	.set_channels		= igc_ethtool_set_channels,
 	.get_priv_flags		= igc_ethtool_get_priv_flags,
 	.set_priv_flags		= igc_ethtool_set_priv_flags,
diff --git a/drivers/net/ethernet/intel/igc/igc_regs.h b/drivers/net/ethernet/intel/igc/igc_regs.h
index 12ddc5793651..41dbfb07eb2f 100644
--- a/drivers/net/ethernet/intel/igc/igc_regs.h
+++ b/drivers/net/ethernet/intel/igc/igc_regs.h
@@ -222,6 +222,22 @@
 
 #define IGC_FTQF(_n)	(0x059E0 + (4 * (_n)))  /* 5-tuple Queue Fltr */
 
+/* Time sync registers - preemption statistics */
+#define IGC_PRMPTDRCNT		0x04284	/* Good RX Preempted Packets */
+#define IGC_PRMEVNTTCNT		0x04298	/* TX Preemption event counter */
+#define IGC_PRMEVNTRCNT		0x0429C	/* RX Preemption event counter */
+
+ /* Preemption Exception Counter */
+#define IGC_PRMEXPRCNT					0x42A0
+/* Received out of order packets with SMD-C */
+#define IGC_PRMEXPRCNT_OOO_SMDC				0x000000FF
+/* Received out of order packets with SMD-C and wrong Frame CNT */
+#define IGC_PRMEXPRCNT_OOO_FRAME_CNT			0x0000FF00
+/* Received out of order packets with SMD-C and wrong Frag CNT */
+#define IGC_PRMEXPRCNT_OOO_FRAG_CNT			0x00FF0000
+/* Received packets with SMD-S and wrong Frag CNT and Frame CNT */
+#define IGC_PRMEXPRCNT_MISS_FRAME_FRAG_CNT		0xFF000000
+
 /* Transmit Scheduling Registers */
 #define IGC_TQAVCTRL		0x3570
 #define IGC_TXQCTL(_n)		(0x3344 + 0x4 * (_n))
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH iwl-next v2 5/9] igc: Add support for frame preemption verification
  2025-02-05 10:05 ` [PATCH iwl-next v2 5/9] igc: Add support for frame preemption verification Faizal Rahim
@ 2025-02-05 17:12   ` Vladimir Oltean
  2025-02-06 14:40     ` Abdul Rahim, Faizal
  0 siblings, 1 reply; 14+ messages in thread
From: Vladimir Oltean @ 2025-02-05 17:12 UTC (permalink / raw)
  To: Faizal Rahim
  Cc: Tony Nguyen, Przemek Kitszel, Andrew Lunn, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Maxime Coquelin,
	Alexandre Torgue, Simon Horman, Russell King, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Russell King, Furong Xu, Serge Semin, Xiaolei Wang, Suraj Jaiswal,
	Kory Maincent, Petr Tesarik, Choong Yong Liang, Kunihiko Hayashi,
	Vinicius Costa Gomes, intel-wired-lan, netdev, linux-kernel,
	linux-stm32, linux-arm-kernel, bpf

On Wed, Feb 05, 2025 at 05:05:20AM -0500, Faizal Rahim wrote:
> This patch implements the "ethtool --set-mm" callback to trigger the
> frame preemption verification handshake.
> 
> Uses the MAC Merge Software Verification (mmsv) mechanism in ethtool
> to perform the verification handshake for igc.
> The structure fpe.mmsv is set by mmsv in ethtool and should remain
> read-only for the driver.
> 
> igc does not use two mmsv callbacks:
> a) configure_tx()
>    - igc lacks registers to configure FPE in the transmit direction.

Yes, maybe, but it's still important to handle this. It tells you when
the preemptible traffic classes should be sent as preemptible on the wire
(i.e. when the verification is either disabled, or it succeeded).

There is a selftest called manual_failed_verification() which supposedly
tests this exact condition: if verification fails, then packets sent to
TC0 are supposed to bump the eMAC's TX counters, even though TC0 is
configured as preemptible. Otherwise stated: even if the tc program says
that a certain traffic class is preemptible, you don't want to actually
send preemptible packets if you haven't verified the link partner can
handle them, since it will likely drop them on RX otherwise.

The problem with that selftest is that it relies on the driver's ability
to report split ethtool counters for eMAC/pMAC, rather than just aggregate.
In lack of that, you need to know through some other mechanism whether
the link partner received those packets as preemptible or express, and
that's kind of hard to add to a selftest.

> b) configure_pmac()
>    - this callback dynamically controls pmac_enabled at runtime. For
>      example, mmsv calls configure_pmac() and disables pmac_enabled when
>      the link partner goes down, even if the user previously enabled it.
>      The intention is to save power but it is not feasible in igc
>      because it causes an endless adapter reset loop:
> 
> 	 1) Board A and Board B complete the verification handshake.
>             Tx mode register for both boards are in TSN mode.
> 	 2) Board B link goes down.
> 
>          On Board A:
> 	 3) mmsv calls configure_pmac() with pmac_enabled = false.
> 	 4) configure_pmac() in igc updates a new field based on
>             pmac_enabled. Driver uses this field in igc_tsn_new_flags()
>             to indicate that the user enabled/disabled FPE.
> 	 5) configure_pmac() in igc calls igc_tsn_offload_apply() to check
>             whether an adapter reset is needed. Calls existing logic in
>             igc_tsn_will_tx_mode_change() and igc_tsn_new_flags().
> 	 6) Since pmac_enabled is now disabled and no other TSN feature
>             is active, igc_tsn_will_tx_mode_change() evaluates to true
>             because Tx mode will switch from TSN to Legacy.
> 	 7) Driver resets the adapter.
> 	 8) Registers are set, and Tx mode switches to Legacy.
> 	 9) When link partner is up, steps 3–8 repeat, but this time
>             with pmac_enabled = true, reactivating TSN.
>             igc_tsn_will_tx_mode_change() evaluates to true again,
>             since Tx mode will switch from Legacy to TSN.
> 	10) Driver resets the adapter.
> 	11) Rest adapter completes, registers are set, and Tx mode
> 	    switches to TSN.
> 
>         On Board B:
> 	12) Adapter reset on Board A at step 10 causes it to detect its
>             link partner as down.

Is this using fiber/DAC, or twisted pair (RJ45 PHY)?

> 	13) Repeats steps 3–8.
> 	14) Once reset adapter on Board A is completed at step 11, it
>             detects its link partner as up.
> 	15) Repeats steps 9–11.
> 
>    - this cycle repeats indefinitely. To avoid this issue, IGC only uses
>      mmsv.pmac_enabled to track whether FPE is enabled or disabled.

Not that I couldn't eventually tolerate this workaround, but I figure
it's worth asking anyway. Isn't there a way to do an adapter reset
without losing link in the PHY (assuming it's the PHY that loses link)?
Is that a consequence of software design that's not careful enough, or
of hardware design? I assume the PHY is a discrete component. Avoiding
the need to retrigger auto-negotiation saves a few seconds of waiting,
so it's worth pursuing if it's possible at all.

> Co-developed-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
> Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
> Co-developed-by: Choong Yong Liang <yong.liang.choong@linux.intel.com>
> Signed-off-by: Choong Yong Liang <yong.liang.choong@linux.intel.com>
> Signed-off-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com>
> ---
> diff --git a/drivers/net/ethernet/intel/igc/igc_ethtool.c b/drivers/net/ethernet/intel/igc/igc_ethtool.c
> index 817838677817..41cb03816f92 100644
> --- a/drivers/net/ethernet/intel/igc/igc_ethtool.c
> +++ b/drivers/net/ethernet/intel/igc/igc_ethtool.c
> @@ -1781,6 +1782,23 @@ static int igc_ethtool_set_eee(struct net_device *netdev,
>  	return 0;
>  }
>  
> +static int igc_ethtool_set_mm(struct net_device *netdev,
> +			      struct ethtool_mm_cfg *cmd,
> +			      struct netlink_ext_ack *extack)
> +{
> +	struct igc_adapter *adapter = netdev_priv(netdev);
> +	struct fpe_t *fpe = &adapter->fpe;
> +
> +	ethtool_mmsv_set_mm(&fpe->mmsv, cmd);
> +
> +	if (fpe->mmsv.pmac_enabled)
> +		static_branch_enable(&igc_fpe_enabled);
> +	else
> +		static_branch_disable(&igc_fpe_enabled);

I don't think the API is correctly used here. If there are 2 igc cards
in the system and one gets an ethtool call with pmac_enabled=false, it
will break the static branch for the other card. You need to use
static_branch_inc() and static_branch_dec(), and react only on changes
(old fpe->mmsv.pmac_enabled ^ cmd->pmac_enabled).

> +
> +	return igc_tsn_offload_apply(adapter);
> +}
> +
>  static int igc_ethtool_get_link_ksettings(struct net_device *netdev,
>  					  struct ethtool_link_ksettings *cmd)
>  {
> diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
> index 44e4f925491f..396410522e01 100644
> --- a/drivers/net/ethernet/intel/igc/igc_main.c
> +++ b/drivers/net/ethernet/intel/igc/igc_main.c
> @@ -2618,6 +2618,15 @@ static int igc_clean_rx_irq(struct igc_q_vector *q_vector, const int budget)
>  			size -= IGC_TS_HDR_LEN;
>  		}
>  
> +		if (static_branch_unlikely(&igc_fpe_enabled) &&
> +		    igc_fpe_is_verify_or_response(rx_desc, size)) {

FWIW, the igc_fpe_enabled static branch is kernel-wide and does not
necessarily mean that FPE is enabled on _this_ NIC (that would still be
adapter->fpe.mmsv.pmac_enabled).

I don't think it's very safe to call ethtool_mmsv events when it doesn't
expect them (pmac_enabled = false), at least from a maintainance perspective
it would be easier not to do that.

I'm just ballparking this, but I think igc_fpe_is_verify_or_response() is
also just a bit more computationally expensive than just checking
adapter->fpe.mmsv.pmac_enabled for the "normal" case where FPE isn't
enabled, so it might make sense to add this as an extra check anyway.

> +			igc_fpe_lp_event_status(rx_desc, &adapter->fpe.mmsv);
> +			/* Advance the ring next-to-clean */
> +			igc_is_non_eop(rx_ring, rx_desc);
> +			cleaned_count++;
> +			continue;
> +		}
> +
>  		if (!skb) {
>  			xdp_init_buff(&ctx.xdp, truesize, &rx_ring->xdp_rxq);
>  			xdp_prepare_buff(&ctx.xdp, pktbuf - igc_rx_offset(rx_ring),
> diff --git a/drivers/net/ethernet/intel/igc/igc_tsn.c b/drivers/net/ethernet/intel/igc/igc_tsn.c
> index f0213cfce07d..52ff6f324d79 100644
> --- a/drivers/net/ethernet/intel/igc/igc_tsn.c
> +++ b/drivers/net/ethernet/intel/igc/igc_tsn.c
> @@ -1,10 +1,153 @@
>  // SPDX-License-Identifier: GPL-2.0
>  /* Copyright (c)  2019 Intel Corporation */
>  
> +#include <linux/kernel.h>
>  #include "igc.h"
> +#include "igc_base.h"
>  #include "igc_hw.h"
>  #include "igc_tsn.h"
>  
> +DEFINE_STATIC_KEY_FALSE(igc_fpe_enabled);
> +
> +static int igc_fpe_init_smd_frame(struct igc_ring *ring,
> +				  struct igc_tx_buffer *buffer,
> +				  struct sk_buff *skb)
> +{
> +	dma_addr_t dma = dma_map_single(ring->dev, skb->data, skb->len,
> +					DMA_TO_DEVICE);
> +
> +	if (dma_mapping_error(ring->dev, dma)) {
> +		netdev_err_once(ring->netdev, "Failed to map DMA for TX\n");
> +		return -ENOMEM;
> +	}
> +
> +	buffer->skb = skb;
> +	buffer->protocol = 0;
> +	buffer->bytecount = skb->len;
> +	buffer->gso_segs = 1;
> +	buffer->time_stamp = jiffies;
> +	dma_unmap_len_set(buffer, len, skb->len);
> +	dma_unmap_addr_set(buffer, dma, dma);
> +
> +	return 0;
> +}
> +
> +static int igc_fpe_init_tx_descriptor(struct igc_ring *ring,
> +				      struct sk_buff *skb, int type)
> +{
> +	struct igc_tx_buffer *buffer;
> +	union igc_adv_tx_desc *desc;
> +	u32 cmd_type, olinfo_status;
> +	int err;
> +
> +	if (!igc_desc_unused(ring))
> +		return -EBUSY;
> +
> +	if (type != IGC_TXD_POPTS_SMD_V && type != IGC_TXD_POPTS_SMD_R)
> +		return -EINVAL;

It should really be sufficient to guarantee at compile time that the argument
is valid (it never comes from user input, so it's not hard). Error handling
for a condition that can't be triggered shouldn't exist, except for the
very specific case of future proofing loosely coupled software components
(what one would call "API", not the case here).

I guess a concern is what happens when, for any reason, you add a new
type of packet that you can transmit, and igc_fpe_init_tx_descriptor()
isn't updated to handle it yet. If you make "int type" a proper enum
type, the compiler will automatically warn when an enum value isn't
handled in the "switch (type)" block, because there's no "default" case.
I think that's actually better than finding this out at runtime, so I
would recommend using an enum for stronger typing.

Warning: it's still only a hint and thus not perfect, in that you
could pass an out-of-range constant literal like "42" and the compiler
wouldn't warn.

> +
> +	buffer = &ring->tx_buffer_info[ring->next_to_use];
> +	err = igc_fpe_init_smd_frame(ring, buffer, skb);
> +	if (err)
> +		return err;
> +
> +	cmd_type = IGC_ADVTXD_DTYP_DATA | IGC_ADVTXD_DCMD_DEXT |
> +		   IGC_ADVTXD_DCMD_IFCS | IGC_TXD_DCMD |
> +		   buffer->bytecount;
> +	olinfo_status = buffer->bytecount << IGC_ADVTXD_PAYLEN_SHIFT;

Still some FIELD_PREP() conversions to do here?

> +
> +	switch (type) {
> +	case IGC_TXD_POPTS_SMD_V:
> +		olinfo_status |= (IGC_TXD_POPTS_SMD_V << IGC_TXD_POPTS_SMD_SHIFT);
> +		break;
> +	case IGC_TXD_POPTS_SMD_R:
> +		olinfo_status |= (IGC_TXD_POPTS_SMD_R << IGC_TXD_POPTS_SMD_SHIFT);
> +		break;
> +	}

What I just said above is a bit diluted by the fact that this doesn't
have to be a switch/case block. It's just "olinfo_status |= type << IGC_TXD_POPTS_SMD_SHIFT"
(also, parentheses are not necessary).

As a mere suggestion, I would write this as:

	switch (type) { // "type" being an enum
	case IGC_TXD_POPTS_SMD_V:
	case IGC_TXD_POPTS_SMD_R:
		olinfo_status |= FIELD_PREP(IGC_TXD_POPTS_SMD_MASK, type);
		break;
	}

> +
> +	desc = IGC_TX_DESC(ring, ring->next_to_use);
> +	desc->read.cmd_type_len = cpu_to_le32(cmd_type);
> +	desc->read.olinfo_status = cpu_to_le32(olinfo_status);
> +	desc->read.buffer_addr = cpu_to_le64(dma_unmap_addr(buffer, dma));
> +
> +	netdev_tx_sent_queue(txring_txq(ring), skb->len);
> +
> +	buffer->next_to_watch = desc;
> +	ring->next_to_use = (ring->next_to_use + 1) % ring->count;
> +
> +	return 0;
> +}
> +
> +bool igc_fpe_transmitted_smd_v(union igc_adv_tx_desc *tx_desc)
> +{
> +	u8 smd = FIELD_GET(IGC_TXD_POPTS_SMD_MASK, tx_desc->read.olinfo_status);
> +
> +	return smd == IGC_TXD_POPTS_SMD_V;
> +}
> +
> +static int igc_fpe_xmit_smd_frame(struct igc_adapter *adapter, int type)
> +{
> +	int cpu = smp_processor_id();
> +	struct netdev_queue *nq;
> +	struct igc_ring *ring;
> +	struct sk_buff *skb;
> +	void *data;
> +	int err;
> +
> +	if (!netif_running(adapter->netdev))
> +		return -ENOTCONN;

Have you seen a code path that would require protecting the transmission
using this conditional?

I get the feeling that this is defensive programming when we could give
guarantees in other ways. But maybe there are some race conditions I'm
not seeing.

ethtool_mmsv_send_mpacket() is called from ethtool_mmsv_verify_timer()
and from ethtool_mmsv_event_handle(event == ETHTOOL_MMSV_LP_SENT_VERIFY_MPACKET).

When called from ethtool_mmsv_verify_timer(), it means the verify timer
has been armed, which only happens if (netif_running(mmsv->dev)). After
netif_running() becomes false (aka __LINK_STATE_START is cleared, aka in
ndo_stop()), you have a chance to call ethtool_mmsv_stop(), which stops
the verification timer. I see you're not doing that. Is there a reason?

> +
> +	ring = igc_get_tx_ring(adapter, cpu);
> +	nq = txring_txq(ring);
> +
> +	skb = alloc_skb(SMD_FRAME_SIZE, GFP_KERNEL);
> +	if (!skb)
> +		return -ENOMEM;
> +
> +	data = skb_put(skb, SMD_FRAME_SIZE);
> +	memset(data, 0, SMD_FRAME_SIZE);

skb_put_zero()

> +
> +	__netif_tx_lock(nq, cpu);
> +
> +	err = igc_fpe_init_tx_descriptor(ring, skb, type);
> +	igc_flush_tx_descriptors(ring);
> +
> +	__netif_tx_unlock(nq);
> +
> +	return err;
> +}
> +
> +static void igc_fpe_send_mpacket(struct ethtool_mmsv *mmsv,
> +				 enum ethtool_mpacket type)
> +{
> +	struct fpe_t *fpe = container_of(mmsv, struct fpe_t, mmsv);
> +	struct igc_adapter *adapter;
> +	int err;
> +
> +	adapter = container_of(fpe, struct igc_adapter, fpe);
> +
> +	if (type == ETHTOOL_MPACKET_VERIFY) {
> +		err = igc_fpe_xmit_smd_frame(adapter, IGC_TXD_POPTS_SMD_V);
> +		if (err)
> +			netdev_err(adapter->netdev, "Error sending SMD-V\n");
> +	} else if (type == ETHTOOL_MPACKET_RESPONSE) {
> +		err = igc_fpe_xmit_smd_frame(adapter, IGC_TXD_POPTS_SMD_R);
> +		if (err)
> +			netdev_err(adapter->netdev, "Error sending SMD-R frame\n");

Can you please rate limit these prints?

> +	}
> +}

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH iwl-next v2 5/9] igc: Add support for frame preemption verification
  2025-02-05 17:12   ` Vladimir Oltean
@ 2025-02-06 14:40     ` Abdul Rahim, Faizal
  2025-02-06 15:04       ` Vladimir Oltean
  0 siblings, 1 reply; 14+ messages in thread
From: Abdul Rahim, Faizal @ 2025-02-06 14:40 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Tony Nguyen, Przemek Kitszel, Andrew Lunn, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Maxime Coquelin,
	Alexandre Torgue, Simon Horman, Russell King, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Russell King, Furong Xu, Serge Semin, Xiaolei Wang, Suraj Jaiswal,
	Kory Maincent, Petr Tesarik, Choong Yong Liang, Kunihiko Hayashi,
	Vinicius Costa Gomes, intel-wired-lan, netdev, linux-kernel,
	linux-stm32, linux-arm-kernel, bpf


Hi Vladimir,

Thanks for the quick review, appreciate your help.

On 6/2/2025 1:12 am, Vladimir Oltean wrote:
> On Wed, Feb 05, 2025 at 05:05:20AM -0500, Faizal Rahim wrote:
>> This patch implements the "ethtool --set-mm" callback to trigger the
>> frame preemption verification handshake.
>>
>> Uses the MAC Merge Software Verification (mmsv) mechanism in ethtool
>> to perform the verification handshake for igc.
>> The structure fpe.mmsv is set by mmsv in ethtool and should remain
>> read-only for the driver.
>>
>> igc does not use two mmsv callbacks:
>> a) configure_tx()
>>     - igc lacks registers to configure FPE in the transmit direction.
> 
> Yes, maybe, but it's still important to handle this. It tells you when
> the preemptible traffic classes should be sent as preemptible on the wire
> (i.e. when the verification is either disabled, or it succeeded).
> 
> There is a selftest called manual_failed_verification() which supposedly
> tests this exact condition: if verification fails, then packets sent to
> TC0 are supposed to bump the eMAC's TX counters, even though TC0 is
> configured as preemptible. Otherwise stated: even if the tc program says
> that a certain traffic class is preemptible, you don't want to actually
> send preemptible packets if you haven't verified the link partner can
> handle them, since it will likely drop them on RX otherwise.

Even though fpe in tx direction isn't set in igc, it still checks 
ethtool_mmsv_is_tx_active() before setting a queue as preemptible.

This is done in :
igc_tsn_enable_offload(struct igc_adapter *adapter) {
{
	....
	if (ethtool_mmsv_is_tx_active(&adapter->fpe.mmsv) &&
             ring->preemptible)
	    txqctl |= IGC_TXQCTL_PREEMPTIBLE;


Wouldn't this handle the situation mentioned ?
Sorry if I miss something here.

> The problem with that selftest is that it relies on the driver's ability
> to report split ethtool counters for eMAC/pMAC, rather than just aggregate.
> In lack of that, you need to know through some other mechanism whether
> the link partner received those packets as preemptible or express, and
> that's kind of hard to add to a selftest.
>
>> b) configure_pmac()
>>     - this callback dynamically controls pmac_enabled at runtime. For
>>       example, mmsv calls configure_pmac() and disables pmac_enabled when
>>       the link partner goes down, even if the user previously enabled it.
>>       The intention is to save power but it is not feasible in igc
>>       because it causes an endless adapter reset loop:
>>
>> 	 1) Board A and Board B complete the verification handshake.
>>              Tx mode register for both boards are in TSN mode.
>> 	 2) Board B link goes down.
>>
>>           On Board A:
>> 	 3) mmsv calls configure_pmac() with pmac_enabled = false.
>> 	 4) configure_pmac() in igc updates a new field based on
>>              pmac_enabled. Driver uses this field in igc_tsn_new_flags()
>>              to indicate that the user enabled/disabled FPE.
>> 	 5) configure_pmac() in igc calls igc_tsn_offload_apply() to check
>>              whether an adapter reset is needed. Calls existing logic in
>>              igc_tsn_will_tx_mode_change() and igc_tsn_new_flags().
>> 	 6) Since pmac_enabled is now disabled and no other TSN feature
>>              is active, igc_tsn_will_tx_mode_change() evaluates to true
>>              because Tx mode will switch from TSN to Legacy.
>> 	 7) Driver resets the adapter.
>> 	 8) Registers are set, and Tx mode switches to Legacy.
>> 	 9) When link partner is up, steps 3–8 repeat, but this time
>>              with pmac_enabled = true, reactivating TSN.
>>              igc_tsn_will_tx_mode_change() evaluates to true again,
>>              since Tx mode will switch from Legacy to TSN.
>> 	10) Driver resets the adapter.
>> 	11) Rest adapter completes, registers are set, and Tx mode
>> 	    switches to TSN.
>>
>>          On Board B:
>> 	12) Adapter reset on Board A at step 10 causes it to detect its
>>              link partner as down.
> 
> Is this using fiber/DAC, or twisted pair (RJ45 PHY)?

It's using RJ45 PHY.

>> 	13) Repeats steps 3–8.
>> 	14) Once reset adapter on Board A is completed at step 11, it
>>              detects its link partner as up.
>> 	15) Repeats steps 9–11.
>>
>>     - this cycle repeats indefinitely. To avoid this issue, IGC only uses
>>       mmsv.pmac_enabled to track whether FPE is enabled or disabled.
> 
> Not that I couldn't eventually tolerate this workaround, but I figure
> it's worth asking anyway. Isn't there a way to do an adapter reset
> without losing link in the PHY (assuming it's the PHY that loses link)?
> Is that a consequence of software design that's not careful enough, or
> of hardware design? I assume the PHY is a discrete component. Avoiding
> the need to retrigger auto-negotiation saves a few seconds of waiting,
> so it's worth pursuing if it's possible at all.
> 

I briefly checked the driver code and the i226 SW User Manual.

The code calls igc_reset_task() → igc_reset() → igc_reinit_locked() → 
igc_down() → igc_reset() → igc_power_down_phy_copper_base().

I suspect igc_power_down_phy_copper_base() contributes to the link loss, 
but there are likely other factors as well.

The SW User Manual states that a software reset (CTRL.DEV_RST) affects 
several components, including "Port Configuration Autoload from NVM, Data 
Path, On-die Memories, MAC, PCS, Auto Negotiation and other Port-related 
Logic, Function Queue Enable, Function Interrupt and Statistics registers, 
Wake-up Management Registers, and Memory Configuration Registers."

Given this, right now, I’m unsure about the feasibility of making this 
change, though I see the benefits mentioned.
Validating this would require significant exploration—i.e., investigating 
the code, running experiments, reviewing commit history, confirming 
hardware expectations from the SW manual, and consulting other hardware SMEs.

Resetting the adapter is a common mechanism in igc that relies on shared 
functions, which can be triggered in various scenarios. Modifying this 
behavior (if feasible) could introduce some risks and would likely require 
additional testing across those scenarios. Focusing on this right now might 
delay the current series, which is primarily aimed at enabling Qbu on i225/6.

Would it be alright if I explore this separately from this Qbu series?

>> Co-developed-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
>> Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
>> Co-developed-by: Choong Yong Liang <yong.liang.choong@linux.intel.com>
>> Signed-off-by: Choong Yong Liang <yong.liang.choong@linux.intel.com>
>> Signed-off-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com>
>> ---
>> diff --git a/drivers/net/ethernet/intel/igc/igc_ethtool.c b/drivers/net/ethernet/intel/igc/igc_ethtool.c
>> index 817838677817..41cb03816f92 100644
>> --- a/drivers/net/ethernet/intel/igc/igc_ethtool.c
>> +++ b/drivers/net/ethernet/intel/igc/igc_ethtool.c
>> @@ -1781,6 +1782,23 @@ static int igc_ethtool_set_eee(struct net_device *netdev,
>>   	return 0;
>>   }
>>   
>> +static int igc_ethtool_set_mm(struct net_device *netdev,
>> +			      struct ethtool_mm_cfg *cmd,
>> +			      struct netlink_ext_ack *extack)
>> +{
>> +	struct igc_adapter *adapter = netdev_priv(netdev);
>> +	struct fpe_t *fpe = &adapter->fpe;
>> +
>> +	ethtool_mmsv_set_mm(&fpe->mmsv, cmd);
>> +
>> +	if (fpe->mmsv.pmac_enabled)
>> +		static_branch_enable(&igc_fpe_enabled);
>> +	else
>> +		static_branch_disable(&igc_fpe_enabled);
> 
> I don't think the API is correctly used here. If there are 2 igc cards
> in the system and one gets an ethtool call with pmac_enabled=false, it
> will break the static branch for the other card. You need to use
> static_branch_inc() and static_branch_dec(), and react only on changes
> (old fpe->mmsv.pmac_enabled ^ cmd->pmac_enabled).
> 
Got it, thanks for the explanation, that makes sense, I'll make the change.

>> +
>> +	return igc_tsn_offload_apply(adapter);
>> +}
>> +
>>   static int igc_ethtool_get_link_ksettings(struct net_device *netdev,
>>   					  struct ethtool_link_ksettings *cmd)
>>   {
>> diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
>> index 44e4f925491f..396410522e01 100644
>> --- a/drivers/net/ethernet/intel/igc/igc_main.c
>> +++ b/drivers/net/ethernet/intel/igc/igc_main.c
>> @@ -2618,6 +2618,15 @@ static int igc_clean_rx_irq(struct igc_q_vector *q_vector, const int budget)
>>   			size -= IGC_TS_HDR_LEN;
>>   		}
>>   
>> +		if (static_branch_unlikely(&igc_fpe_enabled) &&
>> +		    igc_fpe_is_verify_or_response(rx_desc, size)) {
> 
> FWIW, the igc_fpe_enabled static branch is kernel-wide and does not
> necessarily mean that FPE is enabled on _this_ NIC (that would still be
> adapter->fpe.mmsv.pmac_enabled).
> 
> I don't think it's very safe to call ethtool_mmsv events when it doesn't
> expect them (pmac_enabled = false), at least from a maintainance perspective
> it would be easier not to do that.
> 
> I'm just ballparking this, but I think igc_fpe_is_verify_or_response() is
> also just a bit more computationally expensive than just checking
> adapter->fpe.mmsv.pmac_enabled for the "normal" case where FPE isn't
> enabled, so it might make sense to add this as an extra check anyway.
> 

Got it, will update.

>> +			igc_fpe_lp_event_status(rx_desc, &adapter->fpe.mmsv);
>> +			/* Advance the ring next-to-clean */
>> +			igc_is_non_eop(rx_ring, rx_desc);
>> +			cleaned_count++;
>> +			continue;
>> +		}
>> +
>>   		if (!skb) {
>>   			xdp_init_buff(&ctx.xdp, truesize, &rx_ring->xdp_rxq);
>>   			xdp_prepare_buff(&ctx.xdp, pktbuf - igc_rx_offset(rx_ring),
>> diff --git a/drivers/net/ethernet/intel/igc/igc_tsn.c b/drivers/net/ethernet/intel/igc/igc_tsn.c
>> index f0213cfce07d..52ff6f324d79 100644
>> --- a/drivers/net/ethernet/intel/igc/igc_tsn.c
>> +++ b/drivers/net/ethernet/intel/igc/igc_tsn.c
>> @@ -1,10 +1,153 @@
>>   // SPDX-License-Identifier: GPL-2.0
>>   /* Copyright (c)  2019 Intel Corporation */
>>   
>> +#include <linux/kernel.h>
>>   #include "igc.h"
>> +#include "igc_base.h"
>>   #include "igc_hw.h"
>>   #include "igc_tsn.h"
>>   
>> +DEFINE_STATIC_KEY_FALSE(igc_fpe_enabled);
>> +
>> +static int igc_fpe_init_smd_frame(struct igc_ring *ring,
>> +				  struct igc_tx_buffer *buffer,
>> +				  struct sk_buff *skb)
>> +{
>> +	dma_addr_t dma = dma_map_single(ring->dev, skb->data, skb->len,
>> +					DMA_TO_DEVICE);
>> +
>> +	if (dma_mapping_error(ring->dev, dma)) {
>> +		netdev_err_once(ring->netdev, "Failed to map DMA for TX\n");
>> +		return -ENOMEM;
>> +	}
>> +
>> +	buffer->skb = skb;
>> +	buffer->protocol = 0;
>> +	buffer->bytecount = skb->len;
>> +	buffer->gso_segs = 1;
>> +	buffer->time_stamp = jiffies;
>> +	dma_unmap_len_set(buffer, len, skb->len);
>> +	dma_unmap_addr_set(buffer, dma, dma);
>> +
>> +	return 0;
>> +}
>> +
>> +static int igc_fpe_init_tx_descriptor(struct igc_ring *ring,
>> +				      struct sk_buff *skb, int type)
>> +{
>> +	struct igc_tx_buffer *buffer;
>> +	union igc_adv_tx_desc *desc;
>> +	u32 cmd_type, olinfo_status;
>> +	int err;
>> +
>> +	if (!igc_desc_unused(ring))
>> +		return -EBUSY;
>> +
>> +	if (type != IGC_TXD_POPTS_SMD_V && type != IGC_TXD_POPTS_SMD_R)
>> +		return -EINVAL;
> 
> It should really be sufficient to guarantee at compile time that the argument
> is valid (it never comes from user input, so it's not hard). Error handling
> for a condition that can't be triggered shouldn't exist, except for the

Yea that error handling wasn't originally there, it was added due to some 
internal commments, even though it didn't feel right to me.

Will change according to your suggestion.

> very specific case of future proofing loosely coupled software components
> (what one would call "API", not the case here).
> 
> I guess a concern is what happens when, for any reason, you add a new
> type of packet that you can transmit, and igc_fpe_init_tx_descriptor()
> isn't updated to handle it yet. If you make "int type" a proper enum
> type, the compiler will automatically warn when an enum value isn't
> handled in the "switch (type)" block, because there's no "default" case.
> I think that's actually better than finding this out at runtime, so I
> would recommend using an enum for stronger typing.
> 
> Warning: it's still only a hint and thus not perfect, in that you
> could pass an out-of-range constant literal like "42" and the compiler
> wouldn't warn.
> 
>> +
>> +	buffer = &ring->tx_buffer_info[ring->next_to_use];
>> +	err = igc_fpe_init_smd_frame(ring, buffer, skb);
>> +	if (err)
>> +		return err;
>> +
>> +	cmd_type = IGC_ADVTXD_DTYP_DATA | IGC_ADVTXD_DCMD_DEXT |
>> +		   IGC_ADVTXD_DCMD_IFCS | IGC_TXD_DCMD |
>> +		   buffer->bytecount;
>> +	olinfo_status = buffer->bytecount << IGC_ADVTXD_PAYLEN_SHIFT;
> 
> Still some FIELD_PREP() conversions to do here?
> 
>> +
>> +	switch (type) {
>> +	case IGC_TXD_POPTS_SMD_V:
>> +		olinfo_status |= (IGC_TXD_POPTS_SMD_V << IGC_TXD_POPTS_SMD_SHIFT);
>> +		break;
>> +	case IGC_TXD_POPTS_SMD_R:
>> +		olinfo_status |= (IGC_TXD_POPTS_SMD_R << IGC_TXD_POPTS_SMD_SHIFT);
>> +		break;
>> +	}
> 
> What I just said above is a bit diluted by the fact that this doesn't
> have to be a switch/case block. It's just "olinfo_status |= type << IGC_TXD_POPTS_SMD_SHIFT"
> (also, parentheses are not necessary).
> 
> As a mere suggestion, I would write this as:
> 
> 	switch (type) { // "type" being an enum
> 	case IGC_TXD_POPTS_SMD_V:
> 	case IGC_TXD_POPTS_SMD_R:
> 		olinfo_status |= FIELD_PREP(IGC_TXD_POPTS_SMD_MASK, type);
> 		break;
> 	}
> 
>> +
>> +	desc = IGC_TX_DESC(ring, ring->next_to_use);
>> +	desc->read.cmd_type_len = cpu_to_le32(cmd_type);
>> +	desc->read.olinfo_status = cpu_to_le32(olinfo_status);
>> +	desc->read.buffer_addr = cpu_to_le64(dma_unmap_addr(buffer, dma));
>> +
>> +	netdev_tx_sent_queue(txring_txq(ring), skb->len);
>> +
>> +	buffer->next_to_watch = desc;
>> +	ring->next_to_use = (ring->next_to_use + 1) % ring->count;
>> +
>> +	return 0;
>> +}
>> +
>> +bool igc_fpe_transmitted_smd_v(union igc_adv_tx_desc *tx_desc)
>> +{
>> +	u8 smd = FIELD_GET(IGC_TXD_POPTS_SMD_MASK, tx_desc->read.olinfo_status);
>> +
>> +	return smd == IGC_TXD_POPTS_SMD_V;
>> +}
>> +
>> +static int igc_fpe_xmit_smd_frame(struct igc_adapter *adapter, int type)
>> +{
>> +	int cpu = smp_processor_id();
>> +	struct netdev_queue *nq;
>> +	struct igc_ring *ring;
>> +	struct sk_buff *skb;
>> +	void *data;
>> +	int err;
>> +
>> +	if (!netif_running(adapter->netdev))
>> +		return -ENOTCONN;
> 
> Have you seen a code path that would require protecting the transmission
> using this conditional?
> 

I checked, I'm not sure too. I'll check with Vinicius too later, but I 
think it's not needed already since this flow is triggered by mmsv that 
already checks if (netif_running(mmsv->dev)) - as per your explanation below.

> I get the feeling that this is defensive programming when we could give
> guarantees in other ways. But maybe there are some race conditions I'm
> not seeing.
> 
> ethtool_mmsv_send_mpacket() is called from ethtool_mmsv_verify_timer()
> and from ethtool_mmsv_event_handle(event == ETHTOOL_MMSV_LP_SENT_VERIFY_MPACKET).
> 
> When called from ethtool_mmsv_verify_timer(), it means the verify timer
> has been armed, which only happens if (netif_running(mmsv->dev)). After
> netif_running() becomes false (aka __LINK_STATE_START is cleared, aka in
> ndo_stop()), you have a chance to call ethtool_mmsv_stop(), which stops
> the verification timer. I see you're not doing that. Is there a reason?
> 

I missed this - ethtool_mmsv_stop(). Sorry for that, I'll update the code, 
thanks.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH iwl-next v2 5/9] igc: Add support for frame preemption verification
  2025-02-06 14:40     ` Abdul Rahim, Faizal
@ 2025-02-06 15:04       ` Vladimir Oltean
  2025-02-07 16:59         ` Abdul Rahim, Faizal
  0 siblings, 1 reply; 14+ messages in thread
From: Vladimir Oltean @ 2025-02-06 15:04 UTC (permalink / raw)
  To: Abdul Rahim, Faizal
  Cc: Tony Nguyen, Przemek Kitszel, Andrew Lunn, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Maxime Coquelin,
	Alexandre Torgue, Simon Horman, Russell King, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Russell King, Furong Xu, Serge Semin, Xiaolei Wang, Suraj Jaiswal,
	Kory Maincent, Petr Tesarik, Choong Yong Liang, Kunihiko Hayashi,
	Vinicius Costa Gomes, intel-wired-lan, netdev, linux-kernel,
	linux-stm32, linux-arm-kernel, bpf

On Thu, Feb 06, 2025 at 10:40:11PM +0800, Abdul Rahim, Faizal wrote:
> 
> Hi Vladimir,
> 
> Thanks for the quick review, appreciate your help.
> 
> On 6/2/2025 1:12 am, Vladimir Oltean wrote:
> > On Wed, Feb 05, 2025 at 05:05:20AM -0500, Faizal Rahim wrote:
> > > This patch implements the "ethtool --set-mm" callback to trigger the
> > > frame preemption verification handshake.
> > > 
> > > Uses the MAC Merge Software Verification (mmsv) mechanism in ethtool
> > > to perform the verification handshake for igc.
> > > The structure fpe.mmsv is set by mmsv in ethtool and should remain
> > > read-only for the driver.
> > > 
> > > igc does not use two mmsv callbacks:
> > > a) configure_tx()
> > >     - igc lacks registers to configure FPE in the transmit direction.
> > 
> > Yes, maybe, but it's still important to handle this. It tells you when
> > the preemptible traffic classes should be sent as preemptible on the wire
> > (i.e. when the verification is either disabled, or it succeeded).
> > 
> > There is a selftest called manual_failed_verification() which supposedly
> > tests this exact condition: if verification fails, then packets sent to
> > TC0 are supposed to bump the eMAC's TX counters, even though TC0 is
> > configured as preemptible. Otherwise stated: even if the tc program says
> > that a certain traffic class is preemptible, you don't want to actually
> > send preemptible packets if you haven't verified the link partner can
> > handle them, since it will likely drop them on RX otherwise.
> 
> Even though fpe in tx direction isn't set in igc, it still checks
> ethtool_mmsv_is_tx_active() before setting a queue as preemptible.
> 
> This is done in :
> igc_tsn_enable_offload(struct igc_adapter *adapter) {
> {
> 	....
> 	if (ethtool_mmsv_is_tx_active(&adapter->fpe.mmsv) &&
>             ring->preemptible)
> 	    txqctl |= IGC_TXQCTL_PREEMPTIBLE;
> 
> 
> Wouldn't this handle the situation mentioned ?
> Sorry if I miss something here.

And what if tx_active becomes true after you had already configured the
queues with tc (and the above check caused IGC_TXQCTL_PREEMPTIBLE to not
be set)? Shouldn't you set IGC_TXQCTL_PREEMPTIBLE now? Isn't
ethtool_mmsv_configure_tx() exactly the function that notifies you of
changes to tx_active, and hence, aren't you interested in setting up a
callback for it?

Or is igc_tsn_reset() -> igc_tsn_enable_offload() called through some
other path, after verification succeeds, that I'm not seeing? I don't
think so. Maybe coincidentally, but not guaranteed.

> I briefly checked the driver code and the i226 SW User Manual.
> 
> The code calls igc_reset_task() → igc_reset() → igc_reinit_locked() →
> igc_down() → igc_reset() → igc_power_down_phy_copper_base().
> 
> I suspect igc_power_down_phy_copper_base() contributes to the link loss, but
> there are likely other factors as well.
> 
> The SW User Manual states that a software reset (CTRL.DEV_RST) affects
> several components, including "Port Configuration Autoload from NVM, Data
> Path, On-die Memories, MAC, PCS, Auto Negotiation and other Port-related
> Logic, Function Queue Enable, Function Interrupt and Statistics registers,
> Wake-up Management Registers, and Memory Configuration Registers."
> 
> Given this, right now, I’m unsure about the feasibility of making this
> change, though I see the benefits mentioned.
> Validating this would require significant exploration—i.e., investigating
> the code, running experiments, reviewing commit history, confirming hardware
> expectations from the SW manual, and consulting other hardware SMEs.
> 
> Resetting the adapter is a common mechanism in igc that relies on shared
> functions, which can be triggered in various scenarios. Modifying this
> behavior (if feasible) could introduce some risks and would likely require
> additional testing across those scenarios. Focusing on this right now might
> delay the current series, which is primarily aimed at enabling Qbu on
> i225/6.
> 
> Would it be alright if I explore this separately from this Qbu series?

Ok - as I said, it's not as if I couldn't eventually tolerate the workaround
you chose. We'd be putting a dependency of this feature on some unrelated
thing with a high degree of uncertainty. For example, I asked if this is
fiber or a twisted pair PHY, because while for the copper PHY the issue
might be more tractable (clause 28 autoneg on the media side is completely
independent of what the host side NIC is doing - it may reset, it may do
whatever - unless the MAC provides a clock that is important for PHY
operation), I do wonder if anything at all could be done to avoid the
link partner seeing a link loss over fiber, because there, the connection
is direct PCS-to-PCS. If you have to reset, you have to reset, and then
pick up the pieces, I understand.

It's good you're committing to look into this in more depth, though.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH iwl-next v2 5/9] igc: Add support for frame preemption verification
  2025-02-06 15:04       ` Vladimir Oltean
@ 2025-02-07 16:59         ` Abdul Rahim, Faizal
  0 siblings, 0 replies; 14+ messages in thread
From: Abdul Rahim, Faizal @ 2025-02-07 16:59 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Tony Nguyen, Przemek Kitszel, Andrew Lunn, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Maxime Coquelin,
	Alexandre Torgue, Simon Horman, Russell King, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Russell King, Furong Xu, Serge Semin, Xiaolei Wang, Suraj Jaiswal,
	Kory Maincent, Petr Tesarik, Choong Yong Liang, Kunihiko Hayashi,
	Vinicius Costa Gomes, intel-wired-lan, netdev, linux-kernel,
	linux-stm32, linux-arm-kernel, bpf



On 6/2/2025 11:04 pm, Vladimir Oltean wrote:
> On Thu, Feb 06, 2025 at 10:40:11PM +0800, Abdul Rahim, Faizal wrote:
>>
>> Hi Vladimir,
>>
>> Thanks for the quick review, appreciate your help.
>>
>> On 6/2/2025 1:12 am, Vladimir Oltean wrote:
>>> On Wed, Feb 05, 2025 at 05:05:20AM -0500, Faizal Rahim wrote:
>>>> This patch implements the "ethtool --set-mm" callback to trigger the
>>>> frame preemption verification handshake.
>>>>
>>>> Uses the MAC Merge Software Verification (mmsv) mechanism in ethtool
>>>> to perform the verification handshake for igc.
>>>> The structure fpe.mmsv is set by mmsv in ethtool and should remain
>>>> read-only for the driver.
>>>>
>>>> igc does not use two mmsv callbacks:
>>>> a) configure_tx()
>>>>      - igc lacks registers to configure FPE in the transmit direction.
>>>
>>> Yes, maybe, but it's still important to handle this. It tells you when
>>> the preemptible traffic classes should be sent as preemptible on the wire
>>> (i.e. when the verification is either disabled, or it succeeded).
>>>
>>> There is a selftest called manual_failed_verification() which supposedly
>>> tests this exact condition: if verification fails, then packets sent to
>>> TC0 are supposed to bump the eMAC's TX counters, even though TC0 is
>>> configured as preemptible. Otherwise stated: even if the tc program says
>>> that a certain traffic class is preemptible, you don't want to actually
>>> send preemptible packets if you haven't verified the link partner can
>>> handle them, since it will likely drop them on RX otherwise.
>>
>> Even though fpe in tx direction isn't set in igc, it still checks
>> ethtool_mmsv_is_tx_active() before setting a queue as preemptible.
>>
>> This is done in :
>> igc_tsn_enable_offload(struct igc_adapter *adapter) {
>> {
>> 	....
>> 	if (ethtool_mmsv_is_tx_active(&adapter->fpe.mmsv) &&
>>              ring->preemptible)
>> 	    txqctl |= IGC_TXQCTL_PREEMPTIBLE;
>>
>>
>> Wouldn't this handle the situation mentioned ?
>> Sorry if I miss something here.
> 
> And what if tx_active becomes true after you had already configured the
> queues with tc (and the above check caused IGC_TXQCTL_PREEMPTIBLE to not
> be set)? Shouldn't you set IGC_TXQCTL_PREEMPTIBLE now? Isn't
> ethtool_mmsv_configure_tx() exactly the function that notifies you of
> changes to tx_active, and hence, aren't you interested in setting up a
> callback for it?
> 

Ahh okay, got it. I sent v3 that also included this update. Thanks!

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2025-02-07 16:59 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-05 10:05 [PATCH iwl-next v2 0/9] igc: Add support for Frame Preemption feature in IGC Faizal Rahim
2025-02-05 10:05 ` [PATCH iwl-next v2 1/9] net: ethtool: mm: extract stmmac verification logic into common library Faizal Rahim
2025-02-05 10:05 ` [PATCH iwl-next v2 2/9] igc: Rename xdp_get_tx_ring() for non-xdp usage Faizal Rahim
2025-02-05 10:05 ` [PATCH iwl-next v2 3/9] igc: Optimize the TX packet buffer utilization Faizal Rahim
2025-02-05 10:05 ` [PATCH iwl-next v2 4/9] igc: Set the RX packet buffer size for TSN mode Faizal Rahim
2025-02-05 10:05 ` [PATCH iwl-next v2 5/9] igc: Add support for frame preemption verification Faizal Rahim
2025-02-05 17:12   ` Vladimir Oltean
2025-02-06 14:40     ` Abdul Rahim, Faizal
2025-02-06 15:04       ` Vladimir Oltean
2025-02-07 16:59         ` Abdul Rahim, Faizal
2025-02-05 10:05 ` [PATCH iwl-next v2 6/9] igc: Add support to set tx-min-frag-size Faizal Rahim
2025-02-05 10:05 ` [PATCH iwl-next v2 7/9] igc: Add support for preemptible traffic class in taprio Faizal Rahim
2025-02-05 10:05 ` [PATCH iwl-next v2 8/9] igc: Add support to get MAC Merge data via ethtool Faizal Rahim
2025-02-05 10:05 ` [PATCH iwl-next v2 9/9] igc: Add support to get frame preemption statistics " Faizal Rahim

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).