From: Jakub Kicinski <kuba@kernel.org>
To: javen_xu@realsil.com.cn
Cc: Jakub Kicinski <kuba@kernel.org>,
hkallweit1@gmail.com, nic_swsd@realtek.com,
andrew+netdev@lunn.ch, davem@davemloft.net, edumazet@google.com,
pabeni@redhat.com, maxime.chevallier@bootlin.com,
horms@kernel.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH net-next v2 4/6] r8169: add support for RTL8116af
Date: Sat, 13 Jun 2026 15:06:26 -0700 [thread overview]
Message-ID: <20260613220626.220220-1-kuba@kernel.org> (raw)
In-Reply-To: <20260611094345.880-5-javen_xu@realsil.com.cn>
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
r8169: add support for RTL8116af
This change adds support for RTL8116af, a SerDes/SFP variant of the
RTL_GIGA_MAC_VER_52 family that has no internal PHY. It introduces a
phylink_pcs implementation that reads link state directly from the
SerDes registers via OCP and skips r8169_mdio_register() so that no
phy_device is bound, with NULL guards added in several callers.
A few questions about the change below.
[Low]
The commit message says:
Phylink uses pcs to get the link status and speed from its serdes
reg, instead of standard phy reg.
Looking at rtl8169_pcs_get_state(), only link / an_complete / lpa
appear to come from the SerDes registers. The speed and duplex are
hardcoded:
if (state->link) {
state->speed = SPEED_1000;
state->duplex = DUPLEX_FULL;
}
Is that mismatch with the commit message intentional, given that for
1000BASE-X 1000FD is the only valid mode anyway?
> diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
> index 615bd4107359..a73c0215b240 100644
> --- a/drivers/net/ethernet/realtek/r8169_main.c
> +++ b/drivers/net/ethernet/realtek/r8169_main.c
[ ... ]
> @@ -3703,12 +3748,14 @@ static void rtl_hw_start_8117(struct rtl8169_private *tp)
>
> rtl_pcie_state_l2l3_disable(tp);
>
> - rg_saw_cnt = phy_read_paged(tp->phydev, 0x0c42, 0x13) & 0x3fff;
> - if (rg_saw_cnt > 0) {
> - u16 sw_cnt_1ms_ini;
> + if (tp->phydev) {
> + rg_saw_cnt = phy_read_paged(tp->phydev, 0x0c42, 0x13) & 0x3fff;
> + if (rg_saw_cnt > 0) {
> + u16 sw_cnt_1ms_ini;
>
> - sw_cnt_1ms_ini = (16000000 / rg_saw_cnt) & 0x0fff;
> - r8168_mac_ocp_modify(tp, 0xd412, 0x0fff, sw_cnt_1ms_ini);
> + sw_cnt_1ms_ini = (16000000 / rg_saw_cnt) & 0x0fff;
> + r8168_mac_ocp_modify(tp, 0xd412, 0x0fff, sw_cnt_1ms_ini);
> + }
> }
[Medium]
rtl_hw_start_8117() is shared between RTL8117 and the new RTL8116af
since both are RTL_GIGA_MAC_VER_52. With this change the calibration
of MAC register 0xd412 from PHY register 0x0c42:0x13 is silently
skipped for 8116af, leaving 0xd412 at its reset value rather than a
calibrated 1ms count.
Does 8116af actually not need 0xd412 configured, or does it need an
alternative calibration source? The commit log does not mention either
case.
[ ... ]
> @@ -5647,6 +5696,10 @@ static void rtl_mac_link_up(struct phylink_config *config, struct phy_device *ph
> static struct phylink_pcs *rtl_mac_select_pcs(struct phylink_config *config,
> phy_interface_t interface)
> {
> + struct rtl8169_private *tp = container_of(config, struct rtl8169_private, phylink_config);
> +
> + if (interface == PHY_INTERFACE_MODE_1000BASEX || interface == PHY_INTERFACE_MODE_SGMII)
> + return &tp->pcs;
> return NULL;
> }
>
> @@ -5655,6 +5708,55 @@ static void rtl_mac_config(struct phylink_config *config, unsigned int mode,
> {
> }
>
> +static u16 rtl8169_sds_read(struct rtl8169_private *tp, u16 sds_reg)
> +{
> + r8168_mac_ocp_write(tp, OCP_SDS_ADDR_REG, sds_reg);
> + r8168_mac_ocp_write(tp, OCP_SDS_CMD_REG, SDS_CMD_READ);
> + return r8168_mac_ocp_read(tp, OCP_SDS_DATA_REG);
> +}
[Medium]
The three-step SDS sequence (write addr, write cmd, read data) is
implemented with three separate r8168_mac_ocp_*() calls, each of which
takes and releases tp->mac_ocp_lock independently.
rtl8169_pcs_get_state() can run from phylink_resolve() (workqueue) and
also from the ethtool/mii ioctl paths, e.g.:
phylink_ethtool_ksettings_get()
-> phylink_mac_pcs_get_state()
-> pcs->ops->pcs_get_state()
These contexts share no common lock. Can two SDS transactions
interleave between the address/cmd writes and the data read, so the
engine sees an addr from one caller and a cmd from another and returns
data for the wrong register?
The existing r8168_mac_ocp_modify() uses a single
raw_spin_lock_irqsave around __r8168_mac_ocp_read/write to keep its
RMW atomic. Should this SDS helper do the same across the three
accesses?
[Medium]
After writing SDS_CMD_READ (0x0001) to OCP_SDS_CMD_REG, the data
register is read immediately without polling for completion. Other
indirect accessors in this driver follow a kick-then-poll pattern, for
example r8168_phy_ocp_read():
RTL_W32(tp, GPHY_OCP, reg << 15);
return rtl_loop_wait_high(tp, &rtl_ocp_gphy_cond, 25, 10) ?
(RTL_R32(tp, GPHY_OCP) & 0xffff) : -ETIMEDOUT;
and rtl_eri_read, rtl_efuse_read, rtl_ephy_read, rtl_csi_read and
r8169_mdio_read_reg all wait on a completion bit before reading the
data register.
Is the SerDes engine guaranteed to complete synchronously here, or can
the read of OCP_SDS_DATA_REG return stale data from the previous
transaction, leading to wrong BMSR / LPA values being reported back to
phylink?
> +
> +static void rtl8169_pcs_get_state(struct phylink_pcs *pcs,
> + unsigned int neg_mode,
> + struct phylink_link_state *state)
> +{
> + struct rtl8169_private *tp = container_of(pcs, struct rtl8169_private, pcs);
> + u16 bmsr, lpa;
> +
> + bmsr = rtl8169_sds_read(tp, RTL_SDS_C22_BASE + MII_BMSR);
> + lpa = rtl8169_sds_read(tp, RTL_SDS_C22_BASE + MII_LPA);
> +
> + state->link = !!(bmsr & BMSR_LSTATUS);
> + state->an_complete = !!(bmsr & BMSR_ANEGCOMPLETE);
> + if (state->link) {
> + state->speed = SPEED_1000;
> + state->duplex = DUPLEX_FULL;
> + } else {
> + state->speed = SPEED_UNKNOWN;
> + state->duplex = DUPLEX_UNKNOWN;
> + }
> +
> + if (lpa & LPA_1000XPAUSE)
> + state->pause |= MLO_PAUSE_RX | MLO_PAUSE_TX;
> +}
[Low]
The 1000BASE-X LPA encodes pause as two bits, LPA_1000XPAUSE and
LPA_1000XPAUSE_ASYM, which together describe four states (none, sym,
asym TX, asym RX). Treating any peer that sets LPA_1000XPAUSE as full
bidirectional pause looks like it ignores the asymmetric cases.
Could this use phylink_mii_c22_pcs_decode_state() (or equivalent
decoding logic) so both LPA pause bits are interpreted correctly?
> +
> +static int rtl8169_pcs_config(struct phylink_pcs *pcs, unsigned int mode,
> + phy_interface_t interface,
> + const unsigned long *advertising,
> + bool permit_pause_to_mac)
> +{
> + return 0;
> +}
> +
> +static int rtl8169_pcs_validate(struct phylink_pcs *pcs, unsigned long *supported,
> + const struct phylink_link_state *state)
> +{
> + return 0;
> +}
> +
> +static void rtl8169_pcs_an_restart(struct phylink_pcs *pcs)
> +{
> +}
[Medium]
pcs_config and pcs_an_restart are both empty stubs. phylink calls
pcs_an_restart from phylink_ethtool_nway_reset(), so an
ethtool -r ethN on an 8116af interface will return success but will
not actually trigger AN restart in hardware. Similarly, pcs_config is
called whenever the inband advertisement changes, so changes via
ethtool -s ... advertise ... are silently dropped.
There does not appear to be any other code path that programs the
SerDes for 1000BASE-X mode or AN advertisement, so the chip seems to
rely on the bootloader leaving the PCS pre-configured.
Should pcs_an_restart at least write BMCR_ANRESTART to the SerDes
BMCR, and pcs_config program the advertisement?
[ ... ]
> @@ -5828,12 +5947,7 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
> }
> tp->aspm_manageable = !rc;
>
> - if (rtl_is_8125(tp)) {
> - u16 data = r8168_mac_ocp_read(tp, 0xd006);
> -
> - if ((data & 0xff) == 0x07)
> - tp->sfp_mode = true;
> - }
> + tp->sfp_mode = rtl_get_sfp_mode(tp);
[Low]
After detection as RTL8116af, chip still resolves to the
RTL_GIGA_MAC_VER_52 entry in rtl_chip_infos[] whose name is
"RTL8168fp/RTL8117". The probe banner
netdev_info(dev, "%s, %pM, %sXID %x, IRQ %d\n",
chip->name, ...);
therefore identifies an 8116af device as RTL8168fp/RTL8117. Should
chip->name be overridden when rtl_get_sfp_mode() returns
RTL_SFP_8168_AF so that dmesg reports the actual variant?
[ ... ]
> @@ -5941,10 +6055,12 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
> if (rc)
> return rc;
>
> - rc = r8169_mdio_register(tp);
> - if (rc) {
> - phylink_destroy(tp->phylink);
> - return rc;
> + if (tp->sfp_mode != RTL_SFP_8168_AF) {
> + rc = r8169_mdio_register(tp);
> + if (rc) {
> + phylink_destroy(tp->phylink);
> + return rc;
> + }
> }
[High]
r8169_mdio_register() is the only function that initializes
tp->phy_irq_domain via irq_domain_add_linear(). Skipping it for
RTL_SFP_8168_AF leaves tp->phy_irq_domain as NULL.
rtl_set_irq_mask() is unchanged, so the LinkChg bit remains set in
tp->irq_mask. On a SerDes link transition, rtl8169_interrupt() still
runs:
generic_handle_domain_irq(tp->phy_irq_domain, 0);
with tp->phy_irq_domain == NULL. Inside __irq_resolve_mapping() this
falls back to irq_default_domain, which on some architectures will
look up hwirq 0 in an unrelated domain, while on others
handle_irq_desc(NULL) just returns -EINVAL silently.
Since pcs.poll = true is already set for 8168_AF, link state is
polled via phylink anyway. Should LinkChg be masked out of
tp->irq_mask for 8168_AF, or should the generic_handle_domain_irq()
call be guarded by tp->phy_irq_domain (or tp->phydev)?
next prev parent reply other threads:[~2026-06-13 22:06 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-11 9:43 [PATCH net-next v2 0/6] r8169: add support for phylink javen
2026-06-11 9:43 ` [PATCH net-next v2 1/6] r8169: add speed in private struct javen
2026-06-11 9:43 ` [PATCH net-next v2 2/6] r8169: create a virtual interrupt for linkchg javen
2026-06-13 22:05 ` Jakub Kicinski
2026-06-13 22:06 ` Jakub Kicinski
2026-06-11 9:43 ` [PATCH net-next v2 3/6] r8169: add support for phylink javen
2026-06-12 8:13 ` Maxime Chevallier
2026-06-13 22:05 ` Jakub Kicinski
2026-06-13 22:06 ` Jakub Kicinski
2026-06-11 9:43 ` [PATCH net-next v2 4/6] r8169: add support for RTL8116af javen
2026-06-13 22:06 ` Jakub Kicinski
2026-06-13 22:06 ` Jakub Kicinski [this message]
2026-06-11 9:43 ` [PATCH net-next v2 5/6] r8169: add ltr " javen
2026-06-13 22:06 ` Jakub Kicinski
2026-06-13 22:06 ` Jakub Kicinski
2026-06-11 9:43 ` [PATCH net-next v2 6/6] r8169: fix RTL8116af can not enter s0idle and c10 javen
2026-06-13 22:06 ` Jakub Kicinski
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260613220626.220220-1-kuba@kernel.org \
--to=kuba@kernel.org \
--cc=andrew+netdev@lunn.ch \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=hkallweit1@gmail.com \
--cc=horms@kernel.org \
--cc=javen_xu@realsil.com.cn \
--cc=linux-kernel@vger.kernel.org \
--cc=maxime.chevallier@bootlin.com \
--cc=netdev@vger.kernel.org \
--cc=nic_swsd@realtek.com \
--cc=pabeni@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox