* [PATCH net] net: phy: Manual remove LEDs to ensure correct ordering
@ 2023-06-17 15:55 Andrew Lunn
2023-06-18 16:40 ` patchwork-bot+netdevbpf
2023-06-21 14:04 ` Florian Fainelli
0 siblings, 2 replies; 7+ messages in thread
From: Andrew Lunn @ 2023-06-17 15:55 UTC (permalink / raw)
To: netdev
Cc: Heiner Kallweit, ansuelsmth, Russell King, Andrew Lunn, stable,
Florian Fainelli
If the core is left to remove the LEDs via devm_, it is performed too
late, after the PHY driver is removed from the PHY. This results in
dereferencing a NULL pointer when the LED core tries to turn the LED
off before destroying the LED.
Manually unregister the LEDs at a safe point in phy_remove.
Cc: stable@vger.kernel.org
Reported-by: Florian Fainelli <f.fainelli@gmail.com>
Suggested-by: Florian Fainelli <f.fainelli@gmail.com>
Fixes: 01e5b728e9e4 ("net: phy: Add a binding for PHY LEDs")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
---
drivers/net/phy/phy_device.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 17d0d0555a79..53598210be6c 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -3021,6 +3021,15 @@ static int phy_led_blink_set(struct led_classdev *led_cdev,
return err;
}
+static void phy_leds_unregister(struct phy_device *phydev)
+{
+ struct phy_led *phyled;
+
+ list_for_each_entry(phyled, &phydev->leds, list) {
+ led_classdev_unregister(&phyled->led_cdev);
+ }
+}
+
static int of_phy_led(struct phy_device *phydev,
struct device_node *led)
{
@@ -3054,7 +3063,7 @@ static int of_phy_led(struct phy_device *phydev,
init_data.fwnode = of_fwnode_handle(led);
init_data.devname_mandatory = true;
- err = devm_led_classdev_register_ext(dev, cdev, &init_data);
+ err = led_classdev_register_ext(dev, cdev, &init_data);
if (err)
return err;
@@ -3083,6 +3092,7 @@ static int of_phy_leds(struct phy_device *phydev)
err = of_phy_led(phydev, led);
if (err) {
of_node_put(led);
+ phy_leds_unregister(phydev);
return err;
}
}
@@ -3305,6 +3315,9 @@ static int phy_remove(struct device *dev)
cancel_delayed_work_sync(&phydev->state_queue);
+ if (IS_ENABLED(CONFIG_PHYLIB_LEDS))
+ phy_leds_unregister(phydev);
+
phydev->state = PHY_DOWN;
sfp_bus_del_upstream(phydev->sfp_bus);
--
2.40.1
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH net] net: phy: Manual remove LEDs to ensure correct ordering
2023-06-17 15:55 [PATCH net] net: phy: Manual remove LEDs to ensure correct ordering Andrew Lunn
@ 2023-06-18 16:40 ` patchwork-bot+netdevbpf
2023-06-21 14:04 ` Florian Fainelli
1 sibling, 0 replies; 7+ messages in thread
From: patchwork-bot+netdevbpf @ 2023-06-18 16:40 UTC (permalink / raw)
To: Andrew Lunn
Cc: netdev, hkallweit1, ansuelsmth, rmk+kernel, stable, f.fainelli
Hello:
This patch was applied to netdev/net.git (main)
by David S. Miller <davem@davemloft.net>:
On Sat, 17 Jun 2023 17:55:00 +0200 you wrote:
> If the core is left to remove the LEDs via devm_, it is performed too
> late, after the PHY driver is removed from the PHY. This results in
> dereferencing a NULL pointer when the LED core tries to turn the LED
> off before destroying the LED.
>
> Manually unregister the LEDs at a safe point in phy_remove.
>
> [...]
Here is the summary with links:
- [net] net: phy: Manual remove LEDs to ensure correct ordering
https://git.kernel.org/netdev/net/c/c938ab4da0eb
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net] net: phy: Manual remove LEDs to ensure correct ordering
2023-06-17 15:55 [PATCH net] net: phy: Manual remove LEDs to ensure correct ordering Andrew Lunn
2023-06-18 16:40 ` patchwork-bot+netdevbpf
@ 2023-06-21 14:04 ` Florian Fainelli
2023-06-21 15:01 ` Andrew Lunn
2023-06-21 17:04 ` Russell King (Oracle)
1 sibling, 2 replies; 7+ messages in thread
From: Florian Fainelli @ 2023-06-21 14:04 UTC (permalink / raw)
To: Andrew Lunn, netdev; +Cc: Heiner Kallweit, ansuelsmth, Russell King, stable
Hi Andrew,
On 6/17/2023 4:55 PM, Andrew Lunn wrote:
> If the core is left to remove the LEDs via devm_, it is performed too
> late, after the PHY driver is removed from the PHY. This results in
> dereferencing a NULL pointer when the LED core tries to turn the LED
> off before destroying the LED.
>
> Manually unregister the LEDs at a safe point in phy_remove.
>
> Cc: stable@vger.kernel.org
> Reported-by: Florian Fainelli <f.fainelli@gmail.com>
> Suggested-by: Florian Fainelli <f.fainelli@gmail.com>
> Fixes: 01e5b728e9e4 ("net: phy: Add a binding for PHY LEDs")
> Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Thanks for fixing this, this is an improvement, though I can still hit
another sort of use after free whereby the GENET driver removes the
mdio-bcm-unimac platform device and eventually cuts the clock to the
MDIO block thus causing the following:
# reboot -f
[ 18.162000] bcmgenet 8f00000.ethernet eth0: Link is Down
[ 18.305163] SError Interrupt on CPU2, code 0x00000000bf000002 -- SError
[ 18.305170] GISB: target abort at 0x8f00e14 [R ], core: cpu_0
[ 18.305180] CPU: 2 PID: 41 Comm: kworker/2:1 Not tainted
6.4.0-rc5-next-20230607-gc7a93fa22690 #98
[ 18.305187] Hardware name: BCM972180HB_V20 (DT)
[ 18.305191] Workqueue: events set_brightness_delayed
[ 18.305214] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS
BTYPE=--)
[ 18.305220] pc : el1_abort+0x30/0x5c
[ 18.305230] lr : el1_abort+0x24/0x5c
[ 18.305235] sp : ffffffc082b73a90
[ 18.305236] x29: ffffffc082b73a90 x28: ffffff8002fad780 x27:
0000000000000000
[ 18.305243] x26: 0000000000000000 x25: 0000000000000000 x24:
ffffff807dbb340d
[ 18.305250] x23: 0000000060000005 x22: ffffffc08066d9ac x21:
0000000096000210
[ 18.305256] x20: ffffffc082b55e14 x19: ffffffc082b73ad0 x18:
0000000000000000
[ 18.305263] x17: 74656e2f74656e72 x16: 656874652e303030 x15:
303066382f626472
[ 18.305269] x14: ffffff8004a84cd8 x13: 6e69622f7273752f x12:
0000000000000000
[ 18.305275] x11: ffffff8002d1c710 x10: 0000000000000870 x9 :
ffffffc080667e34
[ 18.305282] x8 : ffffff8003d44a80 x7 : fefefefefefefeff x6 :
000073746e657665
[ 18.305288] x5 : ffffff8003d44a80 x4 : ffffffc082b73ad0 x3 :
0000000000000025
[ 18.305294] x2 : 000000000000001c x1 : 0000000004208060 x0 :
0000000000000000
[ 18.305303] Kernel panic - not syncing: Asynchronous SError Interrupt
[ 18.305306] CPU: 2 PID: 41 Comm: kworker/2:1 Not tainted
6.4.0-rc5-next-20230607-gc7a93fa22690 #98
[ 18.305311] Hardware name: BCM972180HB_V20 (DT)
[ 18.305314] Workqueue: events set_brightness_delayed
[ 18.305319] Call trace:
[ 18.305321] dump_backtrace+0xdc/0x114
[ 18.305329] show_stack+0x1c/0x28
[ 18.305333] dump_stack_lvl+0x44/0x58
[ 18.305339] dump_stack+0x14/0x1c
[ 18.305342] panic+0x128/0x2f8
[ 18.305350] nmi_panic+0x50/0x70
[ 18.305356] arm64_serror_panic+0x74/0x80
[ 18.305361] do_serror+0x2c/0x5c
[ 18.305366] el1h_64_error_handler+0x30/0x44
[ 18.305372] el1h_64_error+0x64/0x68
[ 18.305378] el1_abort+0x30/0x5c
[ 18.305383] el1h_64_sync_handler+0x64/0xc8
[ 18.305389] el1h_64_sync+0x64/0x68
[ 18.305392] readl_relaxed+0x0/0x8
[ 18.305401] __mdiobus_write+0x3c/0x94
[ 18.305409] mdiobus_write+0x4c/0x70
[ 18.305415] phy_write+0x1c/0x24
[ 18.305419] bcm_phy_read_shadow+0x24/0x40
[ 18.305423] bcm_phy_led_brightness_set+0x40/0x94
[ 18.305428] phy_led_set_brightness+0x48/0x68
[ 18.305434] set_brightness_delayed_set_brightness+0x44/0x7c
[ 18.305443] set_brightness_delayed+0xc4/0x1a4
[ 18.305447] process_one_work+0x1c0/0x284
[ 18.305455] process_scheduled_works+0x44/0x48
[ 18.305461] worker_thread+0x1e8/0x264
[ 18.305467] kthread+0xcc/0xdc
[ 18.305474] ret_from_fork+0x10/0x20
[ 18.311812] Kernel Offset: disabled
[ 18.311814] CPU features: 0x00000003,00010000,0000420b
[ 18.311818] Memory Limit: none
[ 18.566507] ---[ end Kernel panic - not syncing: Asynchronous SError
Interrupt ]---
still not clear to me how the workqueue managed to execute and not
finish before we unregistered the PHY device.
--
Florian
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net] net: phy: Manual remove LEDs to ensure correct ordering
2023-06-21 14:04 ` Florian Fainelli
@ 2023-06-21 15:01 ` Andrew Lunn
2023-06-21 17:04 ` Russell King (Oracle)
1 sibling, 0 replies; 7+ messages in thread
From: Andrew Lunn @ 2023-06-21 15:01 UTC (permalink / raw)
To: Florian Fainelli
Cc: netdev, Heiner Kallweit, ansuelsmth, Russell King, stable
> Thanks for fixing this, this is an improvement, though I can still hit
> another sort of use after free whereby the GENET driver removes the
> mdio-bcm-unimac platform device and eventually cuts the clock to the MDIO
> block thus causing the following:
Hi Florian
Yes, i was not expecting this patch to fix that. But i was getting the
NULL pointer dereference you pointed out with another setup, and this
change does fix that part of the problem.
> still not clear to me how the workqueue managed to execute and not finish
> before we unregistered the PHY device.
Me neither. I took a look at the MDIO bus driver and could not see
anything obvious. I think you are going to have to scatter printk() in
the code to get a clear understanding of the order things are done.
Maybe it is another devm_ timing issue.
Andrew
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net] net: phy: Manual remove LEDs to ensure correct ordering
2023-06-21 14:04 ` Florian Fainelli
2023-06-21 15:01 ` Andrew Lunn
@ 2023-06-21 17:04 ` Russell King (Oracle)
2023-06-21 17:12 ` Florian Fainelli
1 sibling, 1 reply; 7+ messages in thread
From: Russell King (Oracle) @ 2023-06-21 17:04 UTC (permalink / raw)
To: Florian Fainelli; +Cc: Andrew Lunn, netdev, Heiner Kallweit, ansuelsmth, stable
On Wed, Jun 21, 2023 at 03:04:14PM +0100, Florian Fainelli wrote:
> Hi Andrew,
>
> On 6/17/2023 4:55 PM, Andrew Lunn wrote:
> > If the core is left to remove the LEDs via devm_, it is performed too
> > late, after the PHY driver is removed from the PHY. This results in
> > dereferencing a NULL pointer when the LED core tries to turn the LED
> > off before destroying the LED.
> >
> > Manually unregister the LEDs at a safe point in phy_remove.
> >
> > Cc: stable@vger.kernel.org
> > Reported-by: Florian Fainelli <f.fainelli@gmail.com>
> > Suggested-by: Florian Fainelli <f.fainelli@gmail.com>
> > Fixes: 01e5b728e9e4 ("net: phy: Add a binding for PHY LEDs")
> > Signed-off-by: Andrew Lunn <andrew@lunn.ch>
>
> Thanks for fixing this, this is an improvement, though I can still hit
> another sort of use after free whereby the GENET driver removes the
> mdio-bcm-unimac platform device and eventually cuts the clock to the MDIO
> block thus causing the following:
Hi Florian,
Can you try setting trigger_data->led_cdev to NULL after the
cancel_delayed_work_sync() in netdev_trig_deactivate() and see
what the effect is?
Thanks.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net] net: phy: Manual remove LEDs to ensure correct ordering
2023-06-21 17:04 ` Russell King (Oracle)
@ 2023-06-21 17:12 ` Florian Fainelli
2023-06-21 17:52 ` Russell King (Oracle)
0 siblings, 1 reply; 7+ messages in thread
From: Florian Fainelli @ 2023-06-21 17:12 UTC (permalink / raw)
To: Russell King (Oracle)
Cc: Andrew Lunn, netdev, Heiner Kallweit, ansuelsmth, stable
Hi Russell,
On 6/21/2023 6:04 PM, Russell King (Oracle) wrote:
> On Wed, Jun 21, 2023 at 03:04:14PM +0100, Florian Fainelli wrote:
>> Hi Andrew,
>>
>> On 6/17/2023 4:55 PM, Andrew Lunn wrote:
>>> If the core is left to remove the LEDs via devm_, it is performed too
>>> late, after the PHY driver is removed from the PHY. This results in
>>> dereferencing a NULL pointer when the LED core tries to turn the LED
>>> off before destroying the LED.
>>>
>>> Manually unregister the LEDs at a safe point in phy_remove.
>>>
>>> Cc: stable@vger.kernel.org
>>> Reported-by: Florian Fainelli <f.fainelli@gmail.com>
>>> Suggested-by: Florian Fainelli <f.fainelli@gmail.com>
>>> Fixes: 01e5b728e9e4 ("net: phy: Add a binding for PHY LEDs")
>>> Signed-off-by: Andrew Lunn <andrew@lunn.ch>
>>
>> Thanks for fixing this, this is an improvement, though I can still hit
>> another sort of use after free whereby the GENET driver removes the
>> mdio-bcm-unimac platform device and eventually cuts the clock to the MDIO
>> block thus causing the following:
>
> Hi Florian,
>
> Can you try setting trigger_data->led_cdev to NULL after the
> cancel_delayed_work_sync() in netdev_trig_deactivate() and see
> what the effect is?
Thanks for the suggestion, getting an identical trace as before with
that change.
--
Florian
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net] net: phy: Manual remove LEDs to ensure correct ordering
2023-06-21 17:12 ` Florian Fainelli
@ 2023-06-21 17:52 ` Russell King (Oracle)
0 siblings, 0 replies; 7+ messages in thread
From: Russell King (Oracle) @ 2023-06-21 17:52 UTC (permalink / raw)
To: Florian Fainelli; +Cc: Andrew Lunn, netdev, Heiner Kallweit, ansuelsmth, stable
On Wed, Jun 21, 2023 at 06:12:40PM +0100, Florian Fainelli wrote:
> Hi Russell,
>
> On 6/21/2023 6:04 PM, Russell King (Oracle) wrote:
> > On Wed, Jun 21, 2023 at 03:04:14PM +0100, Florian Fainelli wrote:
> > > Hi Andrew,
> > >
> > > On 6/17/2023 4:55 PM, Andrew Lunn wrote:
> > > > If the core is left to remove the LEDs via devm_, it is performed too
> > > > late, after the PHY driver is removed from the PHY. This results in
> > > > dereferencing a NULL pointer when the LED core tries to turn the LED
> > > > off before destroying the LED.
> > > >
> > > > Manually unregister the LEDs at a safe point in phy_remove.
> > > >
> > > > Cc: stable@vger.kernel.org
> > > > Reported-by: Florian Fainelli <f.fainelli@gmail.com>
> > > > Suggested-by: Florian Fainelli <f.fainelli@gmail.com>
> > > > Fixes: 01e5b728e9e4 ("net: phy: Add a binding for PHY LEDs")
> > > > Signed-off-by: Andrew Lunn <andrew@lunn.ch>
> > >
> > > Thanks for fixing this, this is an improvement, though I can still hit
> > > another sort of use after free whereby the GENET driver removes the
> > > mdio-bcm-unimac platform device and eventually cuts the clock to the MDIO
> > > block thus causing the following:
> >
> > Hi Florian,
> >
> > Can you try setting trigger_data->led_cdev to NULL after the
> > cancel_delayed_work_sync() in netdev_trig_deactivate() and see
> > what the effect is?
>
> Thanks for the suggestion, getting an identical trace as before with that
> change.
Thanks for trying. I was wondering whether the work was being re-queued
after the flush_work(), but seemingly not.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2023-06-21 17:55 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-06-17 15:55 [PATCH net] net: phy: Manual remove LEDs to ensure correct ordering Andrew Lunn
2023-06-18 16:40 ` patchwork-bot+netdevbpf
2023-06-21 14:04 ` Florian Fainelli
2023-06-21 15:01 ` Andrew Lunn
2023-06-21 17:04 ` Russell King (Oracle)
2023-06-21 17:12 ` Florian Fainelli
2023-06-21 17:52 ` Russell King (Oracle)
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).