Date: Sat, 21 Mar 2026 19:51:01 +0000
From: Daniel Golle
To: Andrew Lunn
Cc: Vladimir Oltean, "David S. Miller", Eric Dumazet, Jakub Kicinski,
 Paolo Abeni, netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
 Frank Wunderlich, Chad Monroe, Cezary Wilmanski, Liang Xu,
 "Benny (Ying-Tsan) Weng", Jose Maria Verdu Munoz, Avinash Jayaraman,
 John Crispin
Subject: Re: [PATCH net-next 1/2] net: dsa: mxl862xx: add CRC for MDIO communication

On Sat, Mar 21, 2026 at 08:29:20PM +0100, Andrew Lunn wrote:
> > > This is the question: what to do when you see a checksum failure?
> > > The basic assumption is that MDIO is reliable. PHYs don't do any
> > > sort of checksums, nor does any other switch using MDIO. Why would
> > > you want checksums?
> > >
> > > To detect that the hardware is broken? If so, is returning EIO
> > > sufficient? Would it not be better to admin down all the
> > > interfaces?
> > >
> > > To allow the MDIO clock to run at a higher frequency, at the limit
> > > of the bus, so you get occasional failures? If so, should you not
> > > retry? But are the commands idempotent? Can you safely retry?
> >
> > Your guesses are all correct, and your concerns are justified.
> >
> > Let me explain the whole picture:
> > The switch driver transfers many rather large data structures over
> > MDIO, and lacking interrupt support (the firmware doesn't provide
> > it), this is often even interleaved with polling 8 PHYs and at least
> > one PCS. To keep things from becoming very slow (and to avoid the
> > occasional -ETIMEDOUT breaking the PHY state machine when timing is
> > unlucky), it is common to overclock the MDIO bus to 20 MHz instead
> > of the 2.5 MHz default (the switch claims to support up to 25 MHz,
> > but 20 MHz is sufficient and conservative enough).
> >
> > That, combined with higher temperature of the switch IC (but still
> > within the spec'ed range), can lead to bit errors -- which, if they
> > remain unnoticed, can introduce subtle (security-relevant) issues
> > such as bridging ports which should not be bridged, or flooding to a
> > port on which flooding should be disabled.
>
> O.K. "Interesting" design. 🤐
>
> You could solve the PHY timeout issue by claiming to support PHY
> interrupts, doing the polling in the DSA driver, and raising an
> interrupt if the conditions are met. The mv88e6xxx driver does
> something like this. It has an interrupt controller which the PHYs
> are connected to. Some designs have the switch interrupt output
> connected to a GPIO and so can do real interrupts. Some don't. Rather
> than have all the internal PHYs polled once per second by phylib, the
> mv88e6xxx polls the interrupt status register every 1/5 of a second
> and raises the interrupts instead. Both faster and fewer MDIO
> transfers.

Sadly, the firmware doesn't provide any API for that. The *hardware*
would likely be capable, it even has an external pin to signal
interrupts to the host. But there is no firmware API to set up or mask
interrupts, nor to read or clear the interrupt status.

Raw access to the hardware registers of the interrupt controller isn't
an option either, as the only documented raw register access interface
covers just the switch engine itself, and another (undocumented),
broader debugging register access API only allows write operations on
a very limited range of registers (with the firmware applying an
individual bitmask to each of them), basically covering the GPIO and
LED controller parts...

Plus, most likely the interrupts of the built-in PHYs are as broken as
on all other GPY2xx PHYs (see the PHY driver comments), because it's
more or less the same IP.

tl;dr: Due to firmware and hardware limitations that's not an option.

> I assume the vendor driver does not attempt a retry?
The vendor driver doesn't retry, it merely reports CRC errors.

I thought about it, and retrying would be an option: on pure WRITE
operations the host could simply retry sending (as the original
request is dropped by the firmware in case of a CRC error), and READ
operations could repeat only the data retrieval part, but not the
actual command itself. However, I don't think any of that is worth the
added complexity. If we end up with CRC errors, something is very
wrong, most likely overheating, so trying to forcefully continue
normal operation imho isn't a good idea even if somehow possible.

>
> > Setting all interfaces to admin-down is probably the best
> > compromise in a case like this, as it would also reduce the energy
> > consumption and hence heat emission of the IC (as all built-in PHYs
> > are then down; that's where most of the heat comes from) and
> > prevent damage -- I've only observed CRC errors with the heatsink
> > removed and artificially overheating the IC...
>
> So in the normal use cases you don't expect CRC errors. That seems
> like it should drive the design. Consider any CRC error as fatal and
> shut down.

Ack, I also think that's the most reasonable thing to do.

As the shutdown itself can trigger more CRC errors (PHYs being put
into PDOWN, for example, will cause dozens of MDIO operations to
indirectly access the internal MDIO bus via the firmware
interface...), I will guard this using an atomic flag to prevent the
ongoing shutdown from triggering another shutdown...
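Something along these lines (a standalone userspace sketch only, with
C11's atomic_flag standing in for the kernel's test_and_set_bit(); the
function and variable names are made up for illustration, not actual
mxl862xx driver code):

```c
#include <assert.h>
#include <stdatomic.h>

/* Set once the fatal-error shutdown has started; never cleared. */
static atomic_flag shutting_down = ATOMIC_FLAG_INIT;
static int shutdowns_performed;

/*
 * Hypothetical CRC-error handler: only the first caller wins the flag
 * and performs the fatal shutdown.  CRC errors raised by the teardown
 * itself (e.g. the MDIO traffic caused by putting PHYs into PDOWN)
 * find the flag already set and return immediately.
 */
static void mxl_handle_crc_error(void)
{
	if (atomic_flag_test_and_set(&shutting_down))
		return; /* shutdown already in progress */

	/* Here the real driver would admin-down all interfaces. */
	shutdowns_performed++;
}
```

The important property is that the test-and-set is atomic, so even
concurrent error paths can never start a second shutdown.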