From: Daniel Golle <daniel@makrotopia.org>
To: Andrew Lunn <andrew@lunn.ch>
Cc: Vladimir Oltean <olteanv@gmail.com>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
Frank Wunderlich <frankwu@gmx.de>, Chad Monroe <chad@monroe.io>,
Cezary Wilmanski <cezary.wilmanski@adtran.com>,
Liang Xu <lxu@maxlinear.com>,
"Benny (Ying-Tsan) Weng" <yweng@maxlinear.com>,
Jose Maria Verdu Munoz <jverdu@maxlinear.com>,
Avinash Jayaraman <ajayaraman@maxlinear.com>,
John Crispin <john@phrozen.org>
Subject: Re: [PATCH net-next 1/2] net: dsa: mxl862xx: add CRC for MDIO communication
Date: Sat, 21 Mar 2026 15:53:09 +0000 [thread overview]
Message-ID: <ab6-5XnC6Z3TdytJ@makrotopia.org> (raw)
In-Reply-To: <c5f29b9b-9447-488e-b8cc-e2f7e7cbf1a8@lunn.ch>
On Sat, Mar 21, 2026 at 04:24:06PM +0100, Andrew Lunn wrote:
> > +/* Firmware CRC error codes (outside normal Zephyr errno range). */
> > +#define MXL862XX_FW_CRC6_ERR (-1024)
> > +#define MXL862XX_FW_CRC16_ERR (-1023)
> > +
> > +/* 3GPP CRC-6 lookup table (polynomial 0x6F).
> > + * Matches the firmware's default CRC-6 implementation.
> > + */
> > +static const u8 mxl862xx_crc6_table[256] = {
> > + 0x00, 0x2f, 0x31, 0x1e, 0x0d, 0x22, 0x3c, 0x13,
> > + 0x1a, 0x35, 0x2b, 0x04, 0x17, 0x38, 0x26, 0x09,
>
> I had a quick look around and count not find this table anywhere
> else. Mellanox has a different CRC-6 polynomial. Until somebody else
> needs this, i think it is O.K. here. But sometime in the future it
> might be moved into lib/crc.
>
> > + ret = mxl862xx_crc6_verify(ctrl_enc, len_enc, &fw_result);
> > + if (ret < 0) {
> > + dev_err(&priv->mdiodev->dev,
> > + "CRC-6 error on response to CMD %04x\n", cmd);
> > + return -EIO;
> > + }
>
> This is the question, what to do when you see a checksum failure? The
> basic assumption is MDIO is reliable. PHYs don't do any sort of
> checksums, nor any other switches using MDIO. Why would you want
> checksums?
>
> To detect the hardware is broken? If so, is returning EIO sufficient?
> Would it not be better to admin down all the interfaces?
>
> To allow the MDIO clock to run at a higher frequency, at the limit of
> the bus, so you get occasionally failures? If so, should you not
> retry? But are the commands idempotent? Can you safely retry?
>
Your guesses are all correct, and your concerns are justified.
Let me explain the whole picture:
The switch driver transfers many rather large data structures over
MDIO, and lacking support for interrupts (firmware doesn't support),
this is often even interleaved with polling 8 PHYs and at least one
PCS. To not have things get very slow (and then even sometimes see
-ETIMEDOUT breaking the PHY state machine if timing is unlucky), it
is common to overclock the MDIO bus to 20 MHz instead of the 2.5 MHz
default (the switch claims to support up to 25 MHz, but 20 Mhz is
sufficient and conservative enough).
That, combined with higher temperature of the switch IC (but still
within the spec'ed range), can lead to bit-errors -- which, in case
they remain unnoticed can introduce subtle (security relevant) issues
such as bridging ports which should not be bridged or flooding to a
port on which flooding should be disabled.
At this point I just wanted to make errors visible at all.
In case of the switch reporting back a CRC error for data received,
a limited number of retries would be ok in every case.
However, the same is not true for the opposite direction, ie. an
error detected by the Linux host for data received from the switch:
In case one of the *_ALLOC API calls we cannot simply repeat the
call, and as the data was corrupted, we wouldn't even know how to
undo the failed call. Resetting the switch and replaying the
complete state would be the only way to avoid a firmware resource
leak in this case.
Setting all interfaces to admin-down is probably the best compromise
in a case like this, as it would also reduce the energy consumption
and hence heat emission of the IC (as all built-in PHYs are then down;
that's where most of the heat comes from) and prevent damage -- I've
only observed CRC errors with the heatsink removed and artifically
overheating the IC...
next prev parent reply other threads:[~2026-03-21 15:53 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-21 5:18 [PATCH net-next 0/2] net: dsa: mxl862xx: MDIO bus integrity and optimization Daniel Golle
2026-03-21 5:19 ` [PATCH net-next 1/2] net: dsa: mxl862xx: add CRC for MDIO communication Daniel Golle
2026-03-21 15:24 ` Andrew Lunn
2026-03-21 15:53 ` Daniel Golle [this message]
2026-03-21 19:29 ` Andrew Lunn
2026-03-21 19:51 ` Daniel Golle
2026-03-21 5:20 ` [PATCH net-next 2/2] net: dsa: mxl862xx: use RST_DATA to skip writing zero words Daniel Golle
2026-03-21 15:28 ` Andrew Lunn
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ab6-5XnC6Z3TdytJ@makrotopia.org \
--to=daniel@makrotopia.org \
--cc=ajayaraman@maxlinear.com \
--cc=andrew@lunn.ch \
--cc=cezary.wilmanski@adtran.com \
--cc=chad@monroe.io \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=frankwu@gmx.de \
--cc=john@phrozen.org \
--cc=jverdu@maxlinear.com \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=lxu@maxlinear.com \
--cc=netdev@vger.kernel.org \
--cc=olteanv@gmail.com \
--cc=pabeni@redhat.com \
--cc=yweng@maxlinear.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox