From: "Russell King (Oracle)" <linux@armlinux.org.uk>
To: Alexander Duyck <alexander.duyck@gmail.com>
Cc: netdev@vger.kernel.org, andrew@lunn.ch, hkallweit1@gmail.com,
davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com
Subject: Re: [net-next PATCH 2/2] net: phylink: Fix issues with link balancing w/ BMC present
Date: Wed, 16 Apr 2025 21:05:15 +0100 [thread overview]
Message-ID: <aAANe/qMWIRY2K5l@shell.armlinux.org.uk> (raw)
In-Reply-To: <aAACyQ494eO4LFQD@shell.armlinux.org.uk>
On Wed, Apr 16, 2025 at 08:19:38PM +0100, Russell King (Oracle) wrote:
> So, when a Linux network driver probes, it starts out in administrative
> state *DOWN*. When the administrator configures the network driver,
> .ndo_open is called, which is expected to configure the network adapter.
>
> Part of that process is to call phylink_start() as one of the last
> steps, which detects whether the link is up or not. If the link is up,
> then phylink will call netif_carrier_on() and .mac_link_up(). This
> tells the core networking layers that the network interface is now
> ready to start sending packets, and it will begin queueing packets for
> the network driver to process - not before.
>
> Prior to .ndo_open being called, the networking layers do not expect
> traffic from the network device no matter what physical state the
> media link is in. If .ndo_open fails, the same applies - no traffic is
> expected to be passed to the core network layers from the network
> layers because as far as the network stack is concerned, the interface
> is still administratively down.
>
> Thus, the fact that your BMC thinks that the link is up is irrelevant.
>
> So, start off in a state that packet activity is suspended even if the
> link is up at probe time. Only start packet activity (reception and
> transmission) once .mac_link_up() has been called. Stop that activity
> when .mac_link_down() is subsequently called.
>
> There have been lots of NICs out there where the physical link doesn't
> follow the adminstrative state of the network interface. This is not a
> problem. It may be desirable that it does, but a desire is not the same
> as "it must".
Let me be crystal clear on this.
Phylink has a contract with all existing users. That contract is:
Initial state: link down.
Driver calls phylink_start() in its .ndo_open method.
Phylink does configuration of the PHY and link according to the
chosen link parameters by calling into the MAC, PCS, and phylib as
appropriate.
If the link is then discovered to be up (it might have been already
up before phylink_start() was called), phylink will call the various
components such as PCS and MAC to inform them that the link is now up.
This will mean calling the .mac_link_up() method. Otherwise (if the
link is discovered to be down when the interface is brought up) no
call to either .mac_link_up() nor .mac_link_down() will be made.
If the link _subsequently_ goes down, then phylink deals with that
and calls .mac_link_down() - only if .mac_link_up() was previously
called (that's one of the bugs you discovered, that on resume it
gets called anyway. I've submitted a fix for that contract breach,
which only affects a very small number of drivers - stmmac, ucc_geth
and your fbnic out of 22 total ethernet users plus however many DSA
users we have.)
Only if .mac_link_down() has been called, if the link subsequently
comes back up, then the same process happens as before resulting in
.mac_link_up() being called.
If the interface is taken down, then .mac_link_down() will be called
if and only if .mac_link_up() had been called.
The ordering of .mac_link_up() / .mac_link_down() is a strict
contract term with phylink users.
The reason for this contract: phylink users may have ordering
requirements.
For example, on mac_link_down(), they may wait for packet activity to
stop, and then place the MAC in reset. If called without a previous
.mac_link_up call, the wait stage may time out due to the MAC being
in reset. (Ocelot may suffer with this.)
Another example is fs_enet which also relies on this strict ordering
as described above.
There could be others - there are some that call into firmware on
calls to .mac_link_up() / .mac_link_down() and misordering those
depends on what the firmware is doing, which we have no visibility
of.
As I stated, this is the contract that phylink gave to users, and
the contract still stands, and can't be broken to behave differently
(e.g. calling .mac_link_down() after phylink_start() without an
intervening call to .mac_link_up()) otherwise existing users will
break. Bugs that go against that contract will be fixed, but the
contract will not be intentionally broken.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
next prev parent reply other threads:[~2025-04-16 20:05 UTC|newest]
Thread overview: 44+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-04-16 15:28 [net-next PATCH 0/2] net: phylink: Fix issue w/ BMC link flap Alexander Duyck
2025-04-16 15:28 ` [net-next PATCH 1/2] net: phylink: Drop unused defines for SUPPORTED/ADVERTISED_INTERFACES Alexander Duyck
2025-04-22 8:32 ` Russell King (Oracle)
2025-04-16 15:29 ` [net-next PATCH 2/2] net: phylink: Fix issues with link balancing w/ BMC present Alexander Duyck
2025-04-16 16:01 ` Russell King (Oracle)
2025-04-16 16:16 ` [PATCH net] net: phylink: fix suspend/resume with WoL enabled and link down Russell King (Oracle)
2025-04-16 17:14 ` Russell King (Oracle)
2025-04-17 14:30 ` Alexander H Duyck
2025-04-17 14:35 ` Russell King (Oracle)
2025-04-17 15:23 ` Russell King (Oracle)
2025-04-17 17:06 ` Alexander Duyck
2025-04-17 17:27 ` Russell King (Oracle)
2025-04-17 19:49 ` Alexander Duyck
2025-04-22 9:51 ` Russell King (Oracle)
2025-04-22 15:30 ` Alexander Duyck
2025-04-22 16:00 ` Andrew Lunn
2025-04-22 20:09 ` Alexander Duyck
2025-04-23 0:00 ` patchwork-bot+netdevbpf
2025-04-16 17:12 ` [net-next PATCH 2/2] net: phylink: Fix issues with link balancing w/ BMC present Russell King (Oracle)
2025-04-16 19:03 ` Alexander Duyck
2025-04-16 19:19 ` Russell King (Oracle)
2025-04-16 20:05 ` Russell King (Oracle) [this message]
2025-04-16 22:58 ` Alexander Duyck
2025-04-19 18:11 ` [net-next PATCH 0/2] net: phylink: Fix issue w/ BMC link flap Andrew Lunn
2025-04-20 18:18 ` Alexander Duyck
2025-04-20 21:58 ` Andrew Lunn
2025-04-21 15:51 ` Alexander Duyck
2025-04-21 16:50 ` Alexander Duyck
2025-04-22 1:21 ` Jakub Kicinski
2025-04-22 13:49 ` Andrew Lunn
2025-04-22 15:28 ` Jakub Kicinski
2025-04-22 16:49 ` Andrew Lunn
2025-04-22 17:30 ` Russell King (Oracle)
2025-04-22 18:13 ` Andrew Lunn
2025-04-22 18:50 ` Russell King (Oracle)
2025-04-22 23:51 ` Jakub Kicinski
2025-04-22 21:29 ` Alexander Duyck
2025-04-22 22:26 ` Andrew Lunn
2025-04-22 23:06 ` Alexander Duyck
2025-04-23 18:38 ` Alexander Duyck
2025-04-24 20:34 ` Andrew Lunn
2025-04-24 23:40 ` Alexander Duyck
2025-04-25 13:11 ` Andrew Lunn
2025-04-25 15:41 ` Alexander Duyck
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aAANe/qMWIRY2K5l@shell.armlinux.org.uk \
--to=linux@armlinux.org.uk \
--cc=alexander.duyck@gmail.com \
--cc=andrew@lunn.ch \
--cc=davem@davemloft.net \
--cc=hkallweit1@gmail.com \
--cc=kuba@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).