From: Vladimir Oltean <olteanv@gmail.com>
To: Heiner Kallweit <hkallweit1@gmail.com>
Cc: Ioana Ciornei <ciorneiioana@gmail.com>,
Andrew Lunn <andrew@lunn.ch>,
Russell King <linux@armlinux.org.uk>,
Jakub Kicinski <kuba@kernel.org>,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
Ioana Ciornei <ioana.ciornei@nxp.com>,
Alexandru Ardelean <alexandru.ardelean@analog.com>,
Andre Edich <andre.edich@microchip.com>,
Antoine Tenart <atenart@kernel.org>,
Baruch Siach <baruch@tkos.co.il>,
Christophe Leroy <christophe.leroy@c-s.fr>,
Dan Murphy <dmurphy@ti.com>,
Divya Koppera <Divya.Koppera@microchip.com>,
Florian Fainelli <f.fainelli@gmail.com>,
Hauke Mehrtens <hauke@hauke-m.de>,
Jerome Brunet <jbrunet@baylibre.com>,
Kavya Sree Kotagiri <kavyasree.kotagiri@microchip.com>,
Linus Walleij <linus.walleij@linaro.org>,
Marco Felsch <m.felsch@pengutronix.de>,
Marek Vasut <marex@denx.de>,
Martin Blumenstingl <martin.blumenstingl@googlemail.com>,
Mathias Kresin <dev@kresin.me>,
Maxim Kochetkov <fido_max@inbox.ru>,
Michael Walle <michael@walle.cc>,
Neil Armstrong <narmstrong@baylibre.com>,
Nisar Sayed <Nisar.Sayed@microchip.com>,
Oleksij Rempel <o.rempel@pengutronix.de>,
Philippe Schenker <philippe.schenker@toradex.com>,
Willy Liu <willy.liu@realtek.com>,
Yuiko Oshino <yuiko.oshino@microchip.com>
Subject: Re: [PATCH net-next 00/19] net: phy: add support for shared interrupts (part 1)
Date: Sat, 31 Oct 2020 00:46:33 +0200 [thread overview]
Message-ID: <20201030224633.wxvkt7p7pb2kfbuk@skbuf> (raw)
In-Reply-To: <20201030220642.ctkt2pitdvri3byt@skbuf>
On Sat, Oct 31, 2020 at 12:06:42AM +0200, Vladimir Oltean wrote:
> On Fri, Oct 30, 2020 at 10:56:24PM +0100, Heiner Kallweit wrote:
> > I'd just like to avoid the term "shared interrupt", because it has
> > a well-defined meaning. Our major concern isn't shared interrupts
> > but support for multiple interrupt sources (in addition to
> > link change) in a PHY.
>
> You may be a little bit confused Heiner.
> This series adds support for exactly _that_ meaning of shared interrupts.
> Shared interrupts (aka wired-OR on the PCB) don't work today with the
> PHY library. I have a board that won't even boot to prompt when the
> interrupt lines of its 2 PHYs are enabled, that this series fixes.
> You might need to take another look through the commit messages I'm afraid.
Maybe this diagram will help you visualize better.
time
|
| PHY 1 PHY 2 has pending IRQ
| | (e.g. link up)
| v |
| phy_interrupt() |
| called for PHY 1 |
| | |
| v |
| returns IRQ_HANDLED via |
| phy_clear_interrupt() |
| | |
| | |
| v |
| handling of shared IRQ |
| ends here |
| | |
| | v
| | PHY 2 still has pending IRQ
| | because, you know, it wasn't actually
| | serviced
| v |
| phy_interrupt() |
| called for PHY 1 |
| | |
| v |
| returns IRQ_HANDLED via |
| phy_clear_interrupt() |
| | |
| | |
| v |
| handling of shared IRQ |
| ends here |
| | PHY 2: Hey! It's me! Over here!
| | |
| v |
| phy_interrupt() |
| called for PHY 1 |
| | |
| v |
| returns IRQ_HANDLED via |
| phy_clear_interrupt() |
| | |
| | |
| v |
| handling of shared IRQ |
| ends here |
| | PHY 2: Srsly?
| | |
| v |
| phy_interrupt() |
| called for PHY 1 |
| | |
| v |
| ... ...
|
| 21 seconds later
|
| RCU stall
v
This happens because today, the way phy_interrupt() is written, you can
only return IRQ_NONE and give the other driver a chance _if_ your driver
implements .did_interrupt(). But the kernel documentation of
.did_interrupt() only recommends to implement that function if you are a
multi-PHY package driver (otherwise stated, the hardware chip has an
embedded shared IRQ). But as things stand, _everybody_ should implement
.did_interrupt() in order for any combination of PHY drivers to support
shared IRQs.
What Ioana is proposing, and this is something that I fully agree with,
is that we just get rid of the layering where the PHY library tries to
be helpful but instead invites everybody to write systematically bad
code. Anyone knows how to write an IRQ handler with eyes closed, but the
fact that .did_interrupt() is mandatory for proper shared IRQ support is
not obvious to everybody, it seems. So let's just have a dedicated IRQ
handling function per each PHY driver, so that we don't get confused in
this sloppy mess of return values, and the code can actually be
followed.
Even _with_ Ioana's changes, there is one more textbook case of shared
interrupts causing trouble, and that is actually the reason why nobody
likes them except hardware engineers who don't get to deal with this.
time
|
| PHY 1 probed
| (module or built-in)
| | PHY 2 has pending IRQ
| | (it had link up from previous
| v boot, or from bootloader, etc)
| phy_interrupt() |
| called for PHY 1 |
| | |
| v |
| returns IRQ_NONE as |
| it should v
| | PHY 2 still has pending IRQ
| | but its handler wasn't called
| | because its driver has not been
| | yet loaded
| v |
| phy_interrupt() |
| called for PHY 1 |
| | |
| v |
| returns IRQ_NONE as |
| it should v
| | PHY 2: Not again :(
| | |
| v |
| phy_interrupt() |
| called for PHY 1 |
| | |
| v |
| returns IRQ_NONE as |
| it should |
| | |
| ... ...
| | |
| | PHY 2 driver never gets probed
| | either because it's a module or
| | because the system is too busy
| | checking PHY 1 over and over
| | again for an interrupt that
| | it did not trigger
| | |
| ... ...
|
| 21 seconds later
|
| RCU stall
v
The way that it's solved is that it's never 100% solved.
This one you can just avoid, but never recover from it.
To avoid it, you must ensure from your previous boot environments
(bootloader, kexec) that the IRQ line is not pending. Because if you
leave the shared IRQ line pending, the system is kaput if it happens to
probe the drivers in the wrong order (aka don't probe first the driver
that will clear that shared IRQ). It's like Minesweeper, only worse.
That's why the shutdown hook is there, as a best-effort attempt for
Linux to clean up after itself. But we're always at the mercy of the
bootloader, or even at the mercy of chance. If the previous kernel
panicked, there's no orderly cleanup to speak of.
Hope it's clearer now.
next prev parent reply other threads:[~2020-10-30 22:46 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-10-29 10:07 [PATCH net-next 00/19] net: phy: add support for shared interrupts (part 1) Ioana Ciornei
2020-10-29 10:07 ` [PATCH net-next 01/19] net: phy: export phy_error and phy_trigger_machine Ioana Ciornei
2020-10-29 10:07 ` [PATCH net-next 02/19] net: phy: add a shutdown procedure Ioana Ciornei
2020-10-29 10:07 ` [PATCH net-next 03/19] net: phy: make .ack_interrupt() optional Ioana Ciornei
2020-10-29 10:07 ` [PATCH net-next 04/19] net: phy: at803x: implement generic .handle_interrupt() callback Ioana Ciornei
2020-10-29 10:07 ` [PATCH net-next 05/19] net: phy: at803x: remove the use of .ack_interrupt() Ioana Ciornei
2020-10-29 10:07 ` [PATCH net-next 06/19] net: phy: mscc: use phy_trigger_machine() to notify link change Ioana Ciornei
2020-10-29 10:07 ` [PATCH net-next 07/19] net: phy: mscc: implement generic .handle_interrupt() callback Ioana Ciornei
2020-10-29 10:07 ` [PATCH net-next 08/19] net: phy: mscc: remove the use of .ack_interrupt() Ioana Ciornei
2020-10-29 10:07 ` [PATCH net-next 09/19] net: phy: aquantia: implement generic .handle_interrupt() callback Ioana Ciornei
2020-10-29 10:07 ` [PATCH net-next 10/19] net: phy: aquantia: remove the use of .ack_interrupt() Ioana Ciornei
2020-10-29 10:07 ` [PATCH net-next 11/19] net: phy: broadcom: implement generic .handle_interrupt() callback Ioana Ciornei
2020-10-29 10:07 ` [PATCH net-next 12/19] net: phy: broadcom: remove use of ack_interrupt() Ioana Ciornei
2020-10-29 10:07 ` [PATCH net-next 13/19] net: phy: cicada: implement the generic .handle_interrupt() callback Ioana Ciornei
2020-10-29 10:07 ` [PATCH net-next 14/19] net: phy: cicada: remove the use of .ack_interrupt() Ioana Ciornei
2020-10-29 10:07 ` [PATCH net-next 15/19] net: phy: davicom: implement generic .handle_interrupt() calback Ioana Ciornei
2020-10-29 10:07 ` [PATCH net-next 16/19] net: phy: davicom: remove the use of .ack_interrupt() Ioana Ciornei
2020-10-29 10:07 ` [PATCH net-next 17/19] net: phy: add genphy_handle_interrupt_no_ack() Ioana Ciornei
2020-10-29 10:07 ` [PATCH net-next 18/19] net: phy: realtek: implement generic .handle_interrupt() callback Ioana Ciornei
2020-10-29 10:07 ` [PATCH net-next 19/19] net: phy: realtek: remove the use of .ack_interrupt() Ioana Ciornei
2020-10-30 21:56 ` [PATCH net-next 00/19] net: phy: add support for shared interrupts (part 1) Heiner Kallweit
2020-10-30 22:06 ` Vladimir Oltean
2020-10-30 22:33 ` Heiner Kallweit
2020-10-30 22:46 ` Vladimir Oltean [this message]
2020-10-31 5:27 ` Ioana Ciornei
2020-10-30 22:42 ` Heiner Kallweit
2020-10-30 23:36 ` Andrew Lunn
2020-10-31 5:22 ` Ioana Ciornei
2020-10-31 10:18 ` Heiner Kallweit
2020-10-31 10:57 ` Ioana Ciornei
2020-10-31 14:32 ` Andrew Lunn
2020-10-31 14:51 ` Ioana Ciornei
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20201030224633.wxvkt7p7pb2kfbuk@skbuf \
--to=olteanv@gmail.com \
--cc=Divya.Koppera@microchip.com \
--cc=Nisar.Sayed@microchip.com \
--cc=alexandru.ardelean@analog.com \
--cc=andre.edich@microchip.com \
--cc=andrew@lunn.ch \
--cc=atenart@kernel.org \
--cc=baruch@tkos.co.il \
--cc=christophe.leroy@c-s.fr \
--cc=ciorneiioana@gmail.com \
--cc=dev@kresin.me \
--cc=dmurphy@ti.com \
--cc=f.fainelli@gmail.com \
--cc=fido_max@inbox.ru \
--cc=hauke@hauke-m.de \
--cc=hkallweit1@gmail.com \
--cc=ioana.ciornei@nxp.com \
--cc=jbrunet@baylibre.com \
--cc=kavyasree.kotagiri@microchip.com \
--cc=kuba@kernel.org \
--cc=linus.walleij@linaro.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux@armlinux.org.uk \
--cc=m.felsch@pengutronix.de \
--cc=marex@denx.de \
--cc=martin.blumenstingl@googlemail.com \
--cc=michael@walle.cc \
--cc=narmstrong@baylibre.com \
--cc=netdev@vger.kernel.org \
--cc=o.rempel@pengutronix.de \
--cc=philippe.schenker@toradex.com \
--cc=willy.liu@realtek.com \
--cc=yuiko.oshino@microchip.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox