From: Paolo Abeni <pabeni@redhat.com>
To: Raju Rangoju <Raju.Rangoju@amd.com>, netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, kuba@kernel.org,
edumazet@google.com, davem@davemloft.net, andrew+netdev@lunn.ch,
horms@kernel.org, Shyam-sundar.S-k@amd.com
Subject: Re: [PATCH v3 net 2/3] amd-xgbe: prevent CRC errors during RX adaptation with AN disabled
Date: Tue, 10 Mar 2026 12:06:26 +0100 [thread overview]
Message-ID: <965eb715-4e15-4ee2-bb29-bcb523411869@redhat.com> (raw)
In-Reply-To: <20260306111629.1515676-3-Raju.Rangoju@amd.com>
On 3/6/26 12:16 PM, Raju Rangoju wrote:
> When operating in 10GBASE-KR mode with auto-negotiation disabled and RX
> adaptation enabled, CRC errors can occur during the RX adaptation
> process. This happens because the driver continues transmitting and
> receiving packets while adaptation is in progress.
>
> Fix this by stopping TX/RX immediately when the link goes down and RX
> adaptation needs to be re-triggered, and only re-enabling TX/RX after
> adaptation completes and the link is confirmed up. Introduce a flag to
> track whether TX/RX was disabled for adaptation so it can be restored
> correctly.
>
> This prevents packets from being transmitted or received during the RX
> adaptation window and avoids CRC errors from corrupted frames.
>
> The flag tracking the data path state is synchronized with hardware
> state in xgbe_start() to prevent stale state after device restarts.
> This ensures that after a restart cycle (where xgbe_stop disables
> TX/RX and xgbe_start re-enables them), the flag correctly reflects
> that the data path is active.
>
> Fixes: 4f3b20bfbb75 ("amd-xgbe: add support for rx-adaptation")
> Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com>
> ---
> Changes since v1:
> - Change the data_path_stopped flag to boolean type
> as it is only used as a true/false indicator.
> Changes since v2:
> - change the data_path_stopped flag to be cleared in phy_start() to
> ensure it is reset on device restart
>
> drivers/net/ethernet/amd/xgbe/xgbe-drv.c | 4 ++
> drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c | 63 ++++++++++++++++++++-
> drivers/net/ethernet/amd/xgbe/xgbe.h | 4 ++
> 3 files changed, 69 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
> index 8b79d88480db..39da2f811858 100644
> --- a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
> +++ b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
> @@ -1277,6 +1277,10 @@ static int xgbe_start(struct xgbe_prv_data *pdata)
>
> hw_if->enable_tx(pdata);
> hw_if->enable_rx(pdata);
> + /* Synchronize flag with hardware state after enabling TX/RX.
> + * This prevents stale state after device restart cycles.
> + */
> + pdata->data_path_stopped = false;
>
> udp_tunnel_nic_reset_ntf(netdev);
>
> diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c b/drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c
> index b330e888c944..59a074ed312a 100644
> --- a/drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c
> +++ b/drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c
> @@ -2017,6 +2017,48 @@ static void xgbe_phy_rx_adaptation(struct xgbe_prv_data *pdata)
> xgbe_rx_adaptation(pdata);
> }
>
> +/*
> + * xgbe_phy_stop_data_path - Stop TX/RX to prevent packet corruption
> + * @pdata: driver private data
> + *
> + * This function stops the data path (TX and RX) to prevent packet
> + * corruption during critical PHY operations like RX adaptation.
> + * Must be called before initiating RX adaptation when link goes down.
> + */
> +static void xgbe_phy_stop_data_path(struct xgbe_prv_data *pdata)
> +{
> + if (pdata->data_path_stopped)
> + return;
> +
> + /* Stop TX/RX to prevent packet corruption during RX adaptation */
> + pdata->hw_if.disable_tx(pdata);
> + pdata->hw_if.disable_rx(pdata);
> + pdata->data_path_stopped = true;
> +
> + netif_dbg(pdata, link, pdata->netdev,
> + "stopping data path for RX adaptation\n");
> +}
> +
> +/*
> + * xgbe_phy_start_data_path - Re-enable TX/RX after RX adaptation
> + * @pdata: driver private data
> + *
> + * This function re-enables the data path (TX and RX) after RX adaptation
> + * has completed successfully. Only called when link is confirmed up.
> + */
> +static void xgbe_phy_start_data_path(struct xgbe_prv_data *pdata)
> +{
> + if (!pdata->data_path_stopped)
> + return;
> +
> + pdata->hw_if.enable_rx(pdata);
> + pdata->hw_if.enable_tx(pdata);
> + pdata->data_path_stopped = false;
> +
> + netif_dbg(pdata, link, pdata->netdev,
> + "restarting data path after RX adaptation\n");
> +}
> +
> static void xgbe_phy_rx_reset(struct xgbe_prv_data *pdata)
> {
> int reg;
> @@ -2826,13 +2868,27 @@ static int xgbe_phy_link_status(struct xgbe_prv_data *pdata, int *an_restart)
> if (pdata->en_rx_adap) {
> /* if the link is available and adaptation is done,
> * declare link up
> + *
> + * Note: When link is up and adaptation is done, we can
> + * safely re-enable the data path if it was stopped
> + * for adaptation.
> */
> - if ((reg & MDIO_STAT1_LSTATUS) && pdata->rx_adapt_done)
> + if ((reg & MDIO_STAT1_LSTATUS) && pdata->rx_adapt_done) {
> + xgbe_phy_start_data_path(pdata);
> return 1;
> + }
> /* If either link is not available or adaptation is not done,
> * retrigger the adaptation logic. (if the mode is not set,
> * then issue mailbox command first)
> */
> +
> + /* CRITICAL: Stop data path BEFORE triggering RX adaptation
> + * to prevent CRC errors from packets corrupted during
> + * the adaptation process. This is especially important
> + * when AN is OFF in 10G KR mode.
> + */
> + xgbe_phy_stop_data_path(pdata);
> +
> if (pdata->mode_set) {
> xgbe_phy_rx_adaptation(pdata);
> } else {
> @@ -2840,8 +2896,11 @@ static int xgbe_phy_link_status(struct xgbe_prv_data *pdata, int *an_restart)
> xgbe_phy_set_mode(pdata, phy_data->cur_mode);
> }
>
> - if (pdata->rx_adapt_done)
> + if (pdata->rx_adapt_done) {
> + /* Adaptation complete, safe to re-enable data path */
> + xgbe_phy_start_data_path(pdata);
> return 1;
> + }
> } else if (reg & MDIO_STAT1_LSTATUS)
> return 1;
AI says:
---
What happens if en_rx_adap transitions from true to false while
data_path_stopped is true?
Consider this scenario during an SFP hot-swap from a passive cable (which
enables rx_adap) to an active cable (which disables rx_adap):
1. Initially with passive cable: en_rx_adap=true, link down
2. xgbe_phy_link_status() enters the if (pdata->en_rx_adap) block
3. xgbe_phy_stop_data_path() sets data_path_stopped=true and disables TX/RX
4. Hot-swap to active cable: xgbe_phy_sfi_mode() sets en_rx_adap=0
5. Next call to xgbe_phy_link_status(): en_rx_adap is now false
6. The entire if (pdata->en_rx_adap) block is skipped
7. Link is up, so falls through to: else if (reg & MDIO_STAT1_LSTATUS)
return 1
8. Returns link up, but TX/RX hardware remains disabled
At this point, data_path_stopped=true and the hardware TX/RX is disabled,
but there's no code path to re-enable it when en_rx_adap becomes false
(except a full device restart cycle through xgbe_stop + xgbe_start).
Would it be safer to check data_path_stopped in the else branch and call
xgbe_phy_start_data_path() if needed before returning link up?
---
I think the remark is not correct, as AFAICS xgbe_phy_sfi_mode() is only
invoked by xgbe_phy_link_status(), after checking `en_rx_adap` and ev.
re-starting the ring. Reporting the remark here in case I misread the
code and this actually need a follow-up.
/P
next prev parent reply other threads:[~2026-03-10 11:06 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-06 11:16 [PATCH v3 net 0/3] amd-xgbe: RX adaptation and PHY handling fixes Raju Rangoju
2026-03-06 11:16 ` [PATCH v3 net 1/3] amd-xgbe: fix link status handling in xgbe_rx_adaptation Raju Rangoju
2026-03-06 11:16 ` [PATCH v3 net 2/3] amd-xgbe: prevent CRC errors during RX adaptation with AN disabled Raju Rangoju
2026-03-10 11:06 ` Paolo Abeni [this message]
2026-03-06 11:16 ` [PATCH v3 net 3/3] amd-xgbe: reset PHY settings before starting PHY Raju Rangoju
2026-03-10 11:20 ` [PATCH v3 net 0/3] amd-xgbe: RX adaptation and PHY handling fixes patchwork-bot+netdevbpf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=965eb715-4e15-4ee2-bb29-bcb523411869@redhat.com \
--to=pabeni@redhat.com \
--cc=Raju.Rangoju@amd.com \
--cc=Shyam-sundar.S-k@amd.com \
--cc=andrew+netdev@lunn.ch \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=horms@kernel.org \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox