Netdev List
 help / color / mirror / Atom feed
From: Jakub Kicinski <kuba@kernel.org>
To: javen_xu@realsil.com.cn
Cc: Jakub Kicinski <kuba@kernel.org>,
	hkallweit1@gmail.com, nic_swsd@realtek.com,
	andrew+netdev@lunn.ch, davem@davemloft.net, edumazet@google.com,
	pabeni@redhat.com, maxime.chevallier@bootlin.com,
	horms@kernel.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH net-next v2 4/6] r8169: add support for RTL8116af
Date: Sat, 13 Jun 2026 15:06:26 -0700	[thread overview]
Message-ID: <20260613220626.220220-1-kuba@kernel.org> (raw)
In-Reply-To: <20260611094345.880-5-javen_xu@realsil.com.cn>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
r8169: add support for RTL8116af

This change adds support for RTL8116af, a SerDes/SFP variant of the
RTL_GIGA_MAC_VER_52 family that has no internal PHY. It introduces a
phylink_pcs implementation that reads link state directly from the
SerDes registers via OCP and skips r8169_mdio_register() so that no
phy_device is bound, with NULL guards added in several callers.

A few questions about the change below.

[Low]
The commit message says:

  Phylink uses pcs to get the link status and speed from its serdes
  reg, instead of standard phy reg.

Looking at rtl8169_pcs_get_state(), only link / an_complete / lpa
appear to come from the SerDes registers. The speed and duplex are
hardcoded:

    if (state->link) {
        state->speed = SPEED_1000;
        state->duplex = DUPLEX_FULL;
    }

Is that mismatch with the commit message intentional, given that for
1000BASE-X 1000FD is the only valid mode anyway?

> diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
> index 615bd4107359..a73c0215b240 100644
> --- a/drivers/net/ethernet/realtek/r8169_main.c
> +++ b/drivers/net/ethernet/realtek/r8169_main.c

[ ... ]

> @@ -3703,12 +3748,14 @@ static void rtl_hw_start_8117(struct rtl8169_private *tp)
>  
>  	rtl_pcie_state_l2l3_disable(tp);
>  
> -	rg_saw_cnt = phy_read_paged(tp->phydev, 0x0c42, 0x13) & 0x3fff;
> -	if (rg_saw_cnt > 0) {
> -		u16 sw_cnt_1ms_ini;
> +	if (tp->phydev) {
> +		rg_saw_cnt = phy_read_paged(tp->phydev, 0x0c42, 0x13) & 0x3fff;
> +		if (rg_saw_cnt > 0) {
> +			u16 sw_cnt_1ms_ini;
>  
> -		sw_cnt_1ms_ini = (16000000 / rg_saw_cnt) & 0x0fff;
> -		r8168_mac_ocp_modify(tp, 0xd412, 0x0fff, sw_cnt_1ms_ini);
> +			sw_cnt_1ms_ini = (16000000 / rg_saw_cnt) & 0x0fff;
> +			r8168_mac_ocp_modify(tp, 0xd412, 0x0fff, sw_cnt_1ms_ini);
> +		}
>  	}

[Medium]
rtl_hw_start_8117() is shared between RTL8117 and the new RTL8116af
since both are RTL_GIGA_MAC_VER_52. With this change the calibration
of MAC register 0xd412 from PHY register 0x0c42:0x13 is silently
skipped for 8116af, leaving 0xd412 at its reset value rather than a
calibrated 1ms count.

Does 8116af actually not need 0xd412 configured, or does it need an
alternative calibration source? The commit log does not mention either
case.

[ ... ]

> @@ -5647,6 +5696,10 @@ static void rtl_mac_link_up(struct phylink_config *config, struct phy_device *ph
>  static struct phylink_pcs *rtl_mac_select_pcs(struct phylink_config *config,
>  					      phy_interface_t interface)
>  {
> +	struct rtl8169_private *tp = container_of(config, struct rtl8169_private, phylink_config);
> +
> +	if (interface == PHY_INTERFACE_MODE_1000BASEX || interface == PHY_INTERFACE_MODE_SGMII)
> +		return &tp->pcs;
>  	return NULL;
>  }
>  
> @@ -5655,6 +5708,55 @@ static void rtl_mac_config(struct phylink_config *config, unsigned int mode,
>  {
>  }
>  
> +static u16 rtl8169_sds_read(struct rtl8169_private *tp, u16 sds_reg)
> +{
> +	r8168_mac_ocp_write(tp, OCP_SDS_ADDR_REG, sds_reg);
> +	r8168_mac_ocp_write(tp, OCP_SDS_CMD_REG, SDS_CMD_READ);
> +	return r8168_mac_ocp_read(tp, OCP_SDS_DATA_REG);
> +}

[Medium]
The three-step SDS sequence (write addr, write cmd, read data) is
implemented with three separate r8168_mac_ocp_*() calls, each of which
takes and releases tp->mac_ocp_lock independently.

rtl8169_pcs_get_state() can run from phylink_resolve() (workqueue) and
also from the ethtool/mii ioctl paths, e.g.:

    phylink_ethtool_ksettings_get()
      -> phylink_mac_pcs_get_state()
        -> pcs->ops->pcs_get_state()

These contexts share no common lock. Can two SDS transactions
interleave between the address/cmd writes and the data read, so the
engine sees an addr from one caller and a cmd from another and returns
data for the wrong register?

The existing r8168_mac_ocp_modify() uses a single
raw_spin_lock_irqsave around __r8168_mac_ocp_read/write to keep its
RMW atomic. Should this SDS helper do the same across the three
accesses?

[Medium]
After writing SDS_CMD_READ (0x0001) to OCP_SDS_CMD_REG, the data
register is read immediately without polling for completion. Other
indirect accessors in this driver follow a kick-then-poll pattern, for
example r8168_phy_ocp_read():

    RTL_W32(tp, GPHY_OCP, reg << 15);
    return rtl_loop_wait_high(tp, &rtl_ocp_gphy_cond, 25, 10) ?
        (RTL_R32(tp, GPHY_OCP) & 0xffff) : -ETIMEDOUT;

and rtl_eri_read, rtl_efuse_read, rtl_ephy_read, rtl_csi_read and
r8169_mdio_read_reg all wait on a completion bit before reading the
data register.

Is the SerDes engine guaranteed to complete synchronously here, or can
the read of OCP_SDS_DATA_REG return stale data from the previous
transaction, leading to wrong BMSR / LPA values being reported back to
phylink?

> +
> +static void rtl8169_pcs_get_state(struct phylink_pcs *pcs,
> +				  unsigned int neg_mode,
> +				  struct phylink_link_state *state)
> +{
> +	struct rtl8169_private *tp = container_of(pcs, struct rtl8169_private, pcs);
> +	u16 bmsr, lpa;
> +
> +	bmsr = rtl8169_sds_read(tp, RTL_SDS_C22_BASE + MII_BMSR);
> +	lpa = rtl8169_sds_read(tp, RTL_SDS_C22_BASE + MII_LPA);
> +
> +	state->link = !!(bmsr & BMSR_LSTATUS);
> +	state->an_complete = !!(bmsr & BMSR_ANEGCOMPLETE);
> +	if (state->link) {
> +		state->speed = SPEED_1000;
> +		state->duplex = DUPLEX_FULL;
> +	} else {
> +		state->speed = SPEED_UNKNOWN;
> +		state->duplex = DUPLEX_UNKNOWN;
> +	}
> +
> +	if (lpa & LPA_1000XPAUSE)
> +		state->pause |= MLO_PAUSE_RX | MLO_PAUSE_TX;
> +}

[Low]
The 1000BASE-X LPA encodes pause as two bits, LPA_1000XPAUSE and
LPA_1000XPAUSE_ASYM, which together describe four states (none, sym,
asym TX, asym RX). Treating any peer that sets LPA_1000XPAUSE as full
bidirectional pause looks like it ignores the asymmetric cases.

Could this use phylink_mii_c22_pcs_decode_state() (or equivalent
decoding logic) so both LPA pause bits are interpreted correctly?

> +
> +static int rtl8169_pcs_config(struct phylink_pcs *pcs, unsigned int mode,
> +			      phy_interface_t interface,
> +			      const unsigned long *advertising,
> +			      bool permit_pause_to_mac)
> +{
> +	return 0;
> +}
> +
> +static int rtl8169_pcs_validate(struct phylink_pcs *pcs, unsigned long *supported,
> +				const struct phylink_link_state *state)
> +{
> +	return 0;
> +}
> +
> +static void rtl8169_pcs_an_restart(struct phylink_pcs *pcs)
> +{
> +}

[Medium]
pcs_config and pcs_an_restart are both empty stubs. phylink calls
pcs_an_restart from phylink_ethtool_nway_reset(), so an
ethtool -r ethN on an 8116af interface will return success but will
not actually trigger AN restart in hardware. Similarly, pcs_config is
called whenever the inband advertisement changes, so changes via
ethtool -s ... advertise ... are silently dropped.

There does not appear to be any other code path that programs the
SerDes for 1000BASE-X mode or AN advertisement, so the chip seems to
rely on the bootloader leaving the PCS pre-configured.

Should pcs_an_restart at least write BMCR_ANRESTART to the SerDes
BMCR, and pcs_config program the advertisement?

[ ... ]

> @@ -5828,12 +5947,7 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
>  	}
>  	tp->aspm_manageable = !rc;
>  
> -	if (rtl_is_8125(tp)) {
> -		u16 data = r8168_mac_ocp_read(tp, 0xd006);
> -
> -		if ((data & 0xff) == 0x07)
> -			tp->sfp_mode = true;
> -	}
> +	tp->sfp_mode = rtl_get_sfp_mode(tp);

[Low]
After detection as RTL8116af, chip still resolves to the
RTL_GIGA_MAC_VER_52 entry in rtl_chip_infos[] whose name is
"RTL8168fp/RTL8117". The probe banner

    netdev_info(dev, "%s, %pM, %sXID %x, IRQ %d\n",
                chip->name, ...);

therefore identifies an 8116af device as RTL8168fp/RTL8117. Should
chip->name be overridden when rtl_get_sfp_mode() returns
RTL_SFP_8168_AF so that dmesg reports the actual variant?

[ ... ]

> @@ -5941,10 +6055,12 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
>  	if (rc)
>  		return rc;
>  
> -	rc = r8169_mdio_register(tp);
> -	if (rc) {
> -		phylink_destroy(tp->phylink);
> -		return rc;
> +	if (tp->sfp_mode != RTL_SFP_8168_AF) {
> +		rc = r8169_mdio_register(tp);
> +		if (rc) {
> +			phylink_destroy(tp->phylink);
> +			return rc;
> +		}
>  	}

[High]
r8169_mdio_register() is the only function that initializes
tp->phy_irq_domain via irq_domain_add_linear(). Skipping it for
RTL_SFP_8168_AF leaves tp->phy_irq_domain as NULL.

rtl_set_irq_mask() is unchanged, so the LinkChg bit remains set in
tp->irq_mask. On a SerDes link transition, rtl8169_interrupt() still
runs:

    generic_handle_domain_irq(tp->phy_irq_domain, 0);

with tp->phy_irq_domain == NULL. Inside __irq_resolve_mapping() this
falls back to irq_default_domain, which on some architectures will
look up hwirq 0 in an unrelated domain, while on others
handle_irq_desc(NULL) just returns -EINVAL silently.

Since pcs.poll = true is already set for 8168_AF, link state is
polled via phylink anyway. Should LinkChg be masked out of
tp->irq_mask for 8168_AF, or should the generic_handle_domain_irq()
call be guarded by tp->phy_irq_domain (or tp->phydev)?

  parent reply	other threads:[~2026-06-13 22:06 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-11  9:43 [PATCH net-next v2 0/6] r8169: add support for phylink javen
2026-06-11  9:43 ` [PATCH net-next v2 1/6] r8169: add speed in private struct javen
2026-06-11  9:43 ` [PATCH net-next v2 2/6] r8169: create a virtual interrupt for linkchg javen
2026-06-13 22:05   ` Jakub Kicinski
2026-06-13 22:06   ` Jakub Kicinski
2026-06-11  9:43 ` [PATCH net-next v2 3/6] r8169: add support for phylink javen
2026-06-12  8:13   ` Maxime Chevallier
2026-06-13 22:05   ` Jakub Kicinski
2026-06-13 22:06   ` Jakub Kicinski
2026-06-11  9:43 ` [PATCH net-next v2 4/6] r8169: add support for RTL8116af javen
2026-06-13 22:06   ` Jakub Kicinski
2026-06-13 22:06   ` Jakub Kicinski [this message]
2026-06-11  9:43 ` [PATCH net-next v2 5/6] r8169: add ltr " javen
2026-06-13 22:06   ` Jakub Kicinski
2026-06-13 22:06   ` Jakub Kicinski
2026-06-11  9:43 ` [PATCH net-next v2 6/6] r8169: fix RTL8116af can not enter s0idle and c10 javen
2026-06-13 22:06   ` Jakub Kicinski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260613220626.220220-1-kuba@kernel.org \
    --to=kuba@kernel.org \
    --cc=andrew+netdev@lunn.ch \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=hkallweit1@gmail.com \
    --cc=horms@kernel.org \
    --cc=javen_xu@realsil.com.cn \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maxime.chevallier@bootlin.com \
    --cc=netdev@vger.kernel.org \
    --cc=nic_swsd@realtek.com \
    --cc=pabeni@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox