* [PATCH net] net: dsa: lan9303: avoid dsa_switch_shutdown()
@ 2024-09-04 9:07 Alexander Sverdlin
2024-09-11 14:33 ` Jakub Kicinski
0 siblings, 1 reply; 11+ messages in thread
From: Alexander Sverdlin @ 2024-09-04 9:07 UTC (permalink / raw)
To: netdev; +Cc: Andrew Lunn
dsa_switch_shutdown() doesn't bring down any ports, but only disconnects
slaves from master. Packets still come afterwards into master port and the
ports are being polled for link status. This leads to crashes:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
CPU: 0 PID: 442 Comm: kworker/0:3 Tainted: G O 6.1.99+ #1
Workqueue: events_power_efficient phy_state_machine
pc : lan9303_mdio_phy_read
lr : lan9303_phy_read
Call trace:
lan9303_mdio_phy_read
lan9303_phy_read
dsa_slave_phy_read
__mdiobus_read
mdiobus_read
genphy_update_link
genphy_read_status
phy_check_link_status
phy_state_machine
process_one_work
worker_thread
Call lan9303_remove() instead to really unregister all ports before zeroing
drvdata and dsa_ptr.
Fixes: 0650bf52b31f ("net: dsa: be compatible with masters which unregister on shutdown")
Cc: stable@vger.kernel.org
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@siemens.com>
---
drivers/net/dsa/lan9303-core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/dsa/lan9303-core.c b/drivers/net/dsa/lan9303-core.c
index 268949939636..ecd507355f51 100644
--- a/drivers/net/dsa/lan9303-core.c
+++ b/drivers/net/dsa/lan9303-core.c
@@ -1477,7 +1477,7 @@ EXPORT_SYMBOL(lan9303_remove);
void lan9303_shutdown(struct lan9303 *chip)
{
- dsa_switch_shutdown(chip->ds);
+ lan9303_remove(chip);
}
EXPORT_SYMBOL(lan9303_shutdown);
--
2.46.0
^ permalink raw reply related [flat|nested] 11+ messages in thread* Re: [PATCH net] net: dsa: lan9303: avoid dsa_switch_shutdown() 2024-09-04 9:07 [PATCH net] net: dsa: lan9303: avoid dsa_switch_shutdown() Alexander Sverdlin @ 2024-09-11 14:33 ` Jakub Kicinski 0 siblings, 0 replies; 11+ messages in thread From: Jakub Kicinski @ 2024-09-11 14:33 UTC (permalink / raw) To: Alexander Sverdlin; +Cc: netdev, Andrew Lunn On Wed, 4 Sep 2024 11:07:42 +0200 Alexander Sverdlin wrote: > dsa_switch_shutdown() doesn't bring down any ports, but only disconnects > slaves from master. Packets still come afterwards into master port and the > ports are being polled for link status. This leads to crashes: Something is wrong with the date on your (email) system, the email arrived with a date from a week ago. It's not going to get ingested into our patch handling systems. -- pw-bot: cr ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH net] net: dsa: lan9303: avoid dsa_switch_shutdown()
@ 2024-09-11 14:40 A. Sverdlin
2024-09-11 16:28 ` Vladimir Oltean
2024-09-12 10:15 ` Vladimir Oltean
0 siblings, 2 replies; 11+ messages in thread
From: A. Sverdlin @ 2024-09-11 14:40 UTC (permalink / raw)
To: netdev
Cc: Alexander Sverdlin, Andrew Lunn, Florian Fainelli,
Vladimir Oltean, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, stable
From: Alexander Sverdlin <alexander.sverdlin@siemens.com>
dsa_switch_shutdown() doesn't bring down any ports, but only disconnects
slaves from master. Packets still come afterwards into master port and the
ports are being polled for link status. This leads to crashes:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
CPU: 0 PID: 442 Comm: kworker/0:3 Tainted: G O 6.1.99+ #1
Workqueue: events_power_efficient phy_state_machine
pc : lan9303_mdio_phy_read
lr : lan9303_phy_read
Call trace:
lan9303_mdio_phy_read
lan9303_phy_read
dsa_slave_phy_read
__mdiobus_read
mdiobus_read
genphy_update_link
genphy_read_status
phy_check_link_status
phy_state_machine
process_one_work
worker_thread
Call lan9303_remove() instead to really unregister all ports before zeroing
drvdata and dsa_ptr.
Fixes: 0650bf52b31f ("net: dsa: be compatible with masters which unregister on shutdown")
Cc: stable@vger.kernel.org
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@siemens.com>
---
drivers/net/dsa/lan9303-core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/dsa/lan9303-core.c b/drivers/net/dsa/lan9303-core.c
index 268949939636..ecd507355f51 100644
--- a/drivers/net/dsa/lan9303-core.c
+++ b/drivers/net/dsa/lan9303-core.c
@@ -1477,7 +1477,7 @@ EXPORT_SYMBOL(lan9303_remove);
void lan9303_shutdown(struct lan9303 *chip)
{
- dsa_switch_shutdown(chip->ds);
+ lan9303_remove(chip);
}
EXPORT_SYMBOL(lan9303_shutdown);
--
2.46.0
^ permalink raw reply related [flat|nested] 11+ messages in thread* Re: [PATCH net] net: dsa: lan9303: avoid dsa_switch_shutdown() 2024-09-11 14:40 A. Sverdlin @ 2024-09-11 16:28 ` Vladimir Oltean 2024-09-11 16:55 ` Sverdlin, Alexander 2024-09-12 10:15 ` Vladimir Oltean 1 sibling, 1 reply; 11+ messages in thread From: Vladimir Oltean @ 2024-09-11 16:28 UTC (permalink / raw) To: A. Sverdlin Cc: netdev, Andrew Lunn, Florian Fainelli, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, stable Hi Alexander, On Wed, Sep 11, 2024 at 04:40:03PM +0200, A. Sverdlin wrote: > From: Alexander Sverdlin <alexander.sverdlin@siemens.com> > > dsa_switch_shutdown() doesn't bring down any ports, but only disconnects > slaves from master. Packets still come afterwards into master port and the > ports are being polled for link status. This leads to crashes: > > Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 > CPU: 0 PID: 442 Comm: kworker/0:3 Tainted: G O 6.1.99+ #1 > Workqueue: events_power_efficient phy_state_machine > pc : lan9303_mdio_phy_read > lr : lan9303_phy_read > Call trace: > lan9303_mdio_phy_read > lan9303_phy_read > dsa_slave_phy_read > __mdiobus_read > mdiobus_read > genphy_update_link > genphy_read_status > phy_check_link_status > phy_state_machine > process_one_work > worker_thread > > Call lan9303_remove() instead to really unregister all ports before zeroing > drvdata and dsa_ptr. > > Fixes: 0650bf52b31f ("net: dsa: be compatible with masters which unregister on shutdown") > Cc: stable@vger.kernel.org > Signed-off-by: Alexander Sverdlin <alexander.sverdlin@siemens.com> > --- > drivers/net/dsa/lan9303-core.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/net/dsa/lan9303-core.c b/drivers/net/dsa/lan9303-core.c > index 268949939636..ecd507355f51 100644 > --- a/drivers/net/dsa/lan9303-core.c > +++ b/drivers/net/dsa/lan9303-core.c > @@ -1477,7 +1477,7 @@ EXPORT_SYMBOL(lan9303_remove); > > void lan9303_shutdown(struct lan9303 *chip) > { > - dsa_switch_shutdown(chip->ds); > + lan9303_remove(chip); > } > EXPORT_SYMBOL(lan9303_shutdown); > > -- > 2.46.0 > You've said here that a similar change still does not protect against packets received after shutdown: https://lore.kernel.org/netdev/c5e0e67400816d68e6bf90b4a999bfa28c59043b.camel@siemens.com/ The difference between that and this is the extra lan9303_disable_processing_port() calls here. But while that does disable RX on switch ports, it still doesn't wait for pending RX frames to be processed. So the race is still open. No? ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net] net: dsa: lan9303: avoid dsa_switch_shutdown() 2024-09-11 16:28 ` Vladimir Oltean @ 2024-09-11 16:55 ` Sverdlin, Alexander 2024-09-11 17:04 ` Sverdlin, Alexander 0 siblings, 1 reply; 11+ messages in thread From: Sverdlin, Alexander @ 2024-09-11 16:55 UTC (permalink / raw) To: olteanv@gmail.com Cc: andrew@lunn.ch, davem@davemloft.net, f.fainelli@gmail.com, pabeni@redhat.com, kuba@kernel.org, stable@vger.kernel.org, edumazet@google.com, netdev@vger.kernel.org Hello Vladimir! On Wed, 2024-09-11 at 19:28 +0300, Vladimir Oltean wrote: > On Wed, Sep 11, 2024 at 04:40:03PM +0200, A. Sverdlin wrote: > > From: Alexander Sverdlin <alexander.sverdlin@siemens.com> > > > > dsa_switch_shutdown() doesn't bring down any ports, but only disconnects > > slaves from master. Packets still come afterwards into master port and the > > ports are being polled for link status. This leads to crashes: > > > > Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 > > CPU: 0 PID: 442 Comm: kworker/0:3 Tainted: G O 6.1.99+ #1 > > Workqueue: events_power_efficient phy_state_machine > > pc : lan9303_mdio_phy_read > > lr : lan9303_phy_read > > Call trace: > > lan9303_mdio_phy_read > > lan9303_phy_read > > dsa_slave_phy_read > > __mdiobus_read > > mdiobus_read > > genphy_update_link > > genphy_read_status > > phy_check_link_status > > phy_state_machine > > process_one_work > > worker_thread > > > > Call lan9303_remove() instead to really unregister all ports before zeroing > > drvdata and dsa_ptr. > > > > Fixes: 0650bf52b31f ("net: dsa: be compatible with masters which unregister on shutdown") > > Cc: stable@vger.kernel.org > > Signed-off-by: Alexander Sverdlin <alexander.sverdlin@siemens.com> > > --- > > drivers/net/dsa/lan9303-core.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/net/dsa/lan9303-core.c b/drivers/net/dsa/lan9303-core.c > > index 268949939636..ecd507355f51 100644 > > --- a/drivers/net/dsa/lan9303-core.c > > +++ b/drivers/net/dsa/lan9303-core.c > > @@ -1477,7 +1477,7 @@ EXPORT_SYMBOL(lan9303_remove); > > > > void lan9303_shutdown(struct lan9303 *chip) > > { > > - dsa_switch_shutdown(chip->ds); > > + lan9303_remove(chip); > > } > > EXPORT_SYMBOL(lan9303_shutdown); > > > > -- > > 2.46.0 > > > > You've said here that a similar change still does not protect against > packets received after shutdown: > https://lore.kernel.org/netdev/c5e0e67400816d68e6bf90b4a999bfa28c59043b.camel@siemens.com/ > > The difference between that and this is the extra lan9303_disable_processing_port() > calls here. But while that does disable RX on switch ports, it still doesn't wait > for pending RX frames to be processed. So the race is still open. No? This patch addresses the race of zeroing drvdata in static void lan9303_mdio_shutdown(struct mdio_device *mdiodev) { struct lan9303_mdio *sw_dev = dev_get_drvdata(&mdiodev->dev); if (!sw_dev) return; lan9303_shutdown(&sw_dev->chip); dev_set_drvdata(&mdiodev->dev, NULL); } versus static int lan9303_mdio_phy_read(struct lan9303 *chip, int phy, int reg) { struct lan9303_mdio *sw_dev = dev_get_drvdata(chip->dev); what you refer to is another race, zeroing of dsa_ptr in struct net_device versus the whole network stack, which I addressed in https://lore.kernel.org/netdev/20240910130321.337154-2-alexander.sverdlin@siemens.com/ -- Alexander Sverdlin Siemens AG www.siemens.com ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net] net: dsa: lan9303: avoid dsa_switch_shutdown() 2024-09-11 16:55 ` Sverdlin, Alexander @ 2024-09-11 17:04 ` Sverdlin, Alexander 2024-09-11 17:09 ` Sverdlin, Alexander 0 siblings, 1 reply; 11+ messages in thread From: Sverdlin, Alexander @ 2024-09-11 17:04 UTC (permalink / raw) To: olteanv@gmail.com Cc: andrew@lunn.ch, davem@davemloft.net, f.fainelli@gmail.com, pabeni@redhat.com, kuba@kernel.org, stable@vger.kernel.org, edumazet@google.com, netdev@vger.kernel.org Hello Vladimir, On Wed, 2024-09-11 at 18:55 +0200, Alexander Sverdlin wrote: > > You've said here that a similar change still does not protect against > > packets received after shutdown: > > https://lore.kernel.org/netdev/c5e0e67400816d68e6bf90b4a999bfa28c59043b.camel@siemens.com/ > > > > The difference between that and this is the extra lan9303_disable_processing_port() > > calls here. But while that does disable RX on switch ports, it still doesn't wait > > for pending RX frames to be processed. So the race is still open. No? besides from the below, I've expected this question... In the meanwhile I've tested mv88e6xxx driver, but it (accidentally) has no MDIO race vs shutdown. After some shallow review of the drivers I didn't find dev_get_drvdata <= mdio_read pattern therefore I've posted this tested patch. If you'd prefer to solve this centrally for all drivers, I can test your patch from the MDIO-drvdata PoV. > This patch addresses the race of zeroing drvdata in > > static void lan9303_mdio_shutdown(struct mdio_device *mdiodev) > { > struct lan9303_mdio *sw_dev = dev_get_drvdata(&mdiodev->dev); > > if (!sw_dev) > return; > > lan9303_shutdown(&sw_dev->chip); > > dev_set_drvdata(&mdiodev->dev, NULL); > } > > versus > > static int lan9303_mdio_phy_read(struct lan9303 *chip, int phy, int reg) > { > struct lan9303_mdio *sw_dev = dev_get_drvdata(chip->dev); > > what you refer to is another race, zeroing of dsa_ptr in struct net_device versus > the whole network stack, which I addressed in > https://lore.kernel.org/netdev/20240910130321.337154-2-alexander.sverdlin@siemens.com/ -- Alexander Sverdlin Siemens AG www.siemens.com ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net] net: dsa: lan9303: avoid dsa_switch_shutdown() 2024-09-11 17:04 ` Sverdlin, Alexander @ 2024-09-11 17:09 ` Sverdlin, Alexander 2024-09-11 17:19 ` Vladimir Oltean 0 siblings, 1 reply; 11+ messages in thread From: Sverdlin, Alexander @ 2024-09-11 17:09 UTC (permalink / raw) To: olteanv@gmail.com Cc: andrew@lunn.ch, davem@davemloft.net, f.fainelli@gmail.com, pabeni@redhat.com, kuba@kernel.org, stable@vger.kernel.org, edumazet@google.com, netdev@vger.kernel.org On Wed, 2024-09-11 at 19:04 +0200, Alexander Sverdlin wrote: > > > The difference between that and this is the extra lan9303_disable_processing_port() > > > calls here. But while that does disable RX on switch ports, it still doesn't wait > > > for pending RX frames to be processed. So the race is still open. No? > > besides from the below, I've expected this question... In the meanwhile I've tested > mv88e6xxx driver, but it (accidentally) has no MDIO race vs shutdown. > After some shallow review of the drivers I didn't find dev_get_drvdata <= mdio_read > pattern therefore I've posted this tested patch. > > If you'd prefer to solve this centrally for all drivers, I can test your patch from > the MDIO-drvdata PoV. But this would mean throwing away the whole "net: dsa: be compatible with masters which unregister on shutdown" work? -- Alexander Sverdlin Siemens AG www.siemens.com ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net] net: dsa: lan9303: avoid dsa_switch_shutdown() 2024-09-11 17:09 ` Sverdlin, Alexander @ 2024-09-11 17:19 ` Vladimir Oltean 0 siblings, 0 replies; 11+ messages in thread From: Vladimir Oltean @ 2024-09-11 17:19 UTC (permalink / raw) To: Sverdlin, Alexander Cc: andrew@lunn.ch, davem@davemloft.net, f.fainelli@gmail.com, pabeni@redhat.com, kuba@kernel.org, stable@vger.kernel.org, edumazet@google.com, netdev@vger.kernel.org On Wed, Sep 11, 2024 at 05:09:08PM +0000, Sverdlin, Alexander wrote: > On Wed, 2024-09-11 at 19:04 +0200, Alexander Sverdlin wrote: > > > > The difference between that and this is the extra lan9303_disable_processing_port() > > > > calls here. But while that does disable RX on switch ports, it still doesn't wait > > > > for pending RX frames to be processed. So the race is still open. No? > > > > besides from the below, I've expected this question... In the meanwhile I've tested > > mv88e6xxx driver, but it (accidentally) has no MDIO race vs shutdown. > > After some shallow review of the drivers I didn't find dev_get_drvdata <= mdio_read > > pattern therefore I've posted this tested patch. > > > > If you'd prefer to solve this centrally for all drivers, I can test your patch from > > the MDIO-drvdata PoV. > > But this would mean throwing away the whole > "net: dsa: be compatible with masters which unregister on shutdown" work? I did not propose anything (yet). I'm still trying to form a mental model of what is broken and what works. Hence the request to test that change. OTOH, this patch is equally throwing away the whole "net: dsa: be compatible with masters which unregister on shutdown" work (for lan9303). ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net] net: dsa: lan9303: avoid dsa_switch_shutdown() 2024-09-11 14:40 A. Sverdlin 2024-09-11 16:28 ` Vladimir Oltean @ 2024-09-12 10:15 ` Vladimir Oltean 2024-09-13 16:16 ` Sverdlin, Alexander 1 sibling, 1 reply; 11+ messages in thread From: Vladimir Oltean @ 2024-09-12 10:15 UTC (permalink / raw) To: A. Sverdlin Cc: netdev, Andrew Lunn, Florian Fainelli, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, stable [-- Attachment #1: Type: text/plain, Size: 1261 bytes --] Hi Alexander, On Wed, Sep 11, 2024 at 04:40:03PM +0200, A. Sverdlin wrote: > From: Alexander Sverdlin <alexander.sverdlin@siemens.com> > > dsa_switch_shutdown() doesn't bring down any ports, but only disconnects > slaves from master. Packets still come afterwards into master port and the > ports are being polled for link status. This leads to crashes: > > Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 > CPU: 0 PID: 442 Comm: kworker/0:3 Tainted: G O 6.1.99+ #1 > Workqueue: events_power_efficient phy_state_machine > pc : lan9303_mdio_phy_read > lr : lan9303_phy_read > Call trace: > lan9303_mdio_phy_read > lan9303_phy_read > dsa_slave_phy_read > __mdiobus_read > mdiobus_read > genphy_update_link > genphy_read_status > phy_check_link_status > phy_state_machine > process_one_work > worker_thread > > Call lan9303_remove() instead to really unregister all ports before zeroing > drvdata and dsa_ptr. > > Fixes: 0650bf52b31f ("net: dsa: be compatible with masters which unregister on shutdown") > Cc: stable@vger.kernel.org > Signed-off-by: Alexander Sverdlin <alexander.sverdlin@siemens.com> > --- Could you please test this alternative solution (patch attached) for both reported problems? Thanks. [-- Attachment #2: 0001-net-dsa-improve-shutdown-sequence.patch --] [-- Type: text/x-diff, Size: 3696 bytes --] From e9ffae9325d1bf683e455f78ac1804b3612a4fd7 Mon Sep 17 00:00:00 2001 From: Vladimir Oltean <vladimir.oltean@nxp.com> Date: Thu, 12 Sep 2024 01:14:19 +0300 Subject: [PATCH] net: dsa: improve shutdown sequence Alexander Sverdlin presents 2 problems during shutdown with the lan9303 driver. One is specific to lan9303 and the other just happens to reproduce there. The first problem is that lan9303 is unique among DSA drivers in that it calls dev_get_drvdata() at "arbitrary runtime" (not probe, not shutdown, not remove): phy_state_machine() -> ... -> dsa_user_phy_read() -> ds->ops->phy_read() -> lan9303_phy_read() -> chip->ops->phy_read() -> lan9303_mdio_phy_read() -> dev_get_drvdata() But we never stop the phy_state_machine(), so it may continue to run after dsa_switch_shutdown(). Our common pattern in all DSA drivers is to set drvdata to NULL to suppress the remove() method that may come afterwards. But in this case it will result in an NPD. The second problem is that the way in which we set dp->conduit->dsa_ptr = NULL; is concurrent with receive packet processing. dsa_switch_rcv() checks once whether dev->dsa_ptr is NULL, but afterwards, rather than continuing to use that non-NULL value, dev->dsa_ptr is dereferenced again and again without NULL checks: dsa_conduit_find_user() and many other places. In between dereferences, there is no locking to ensure that what was valid once continues to be valid. Both problems have the common aspect that closing the conduit interface solves them. In the first case, dev_close(conduit) triggers the NETDEV_GOING_DOWN event in dsa_user_netdevice_event() which closes user ports as well. dsa_port_disable_rt() calls phylink_stop(), which synchronously stops the phylink state machine, and ds->ops->phy_read() will thus no longer call into the driver after this point. In the second case, dev_close(conduit) should do this, as per Documentation/networking/driver.rst: | Quiescence | ---------- | | After the ndo_stop routine has been called, the hardware must | not receive or transmit any data. All in flight packets must | be aborted. If necessary, poll or wait for completion of | any reset commands. So it should be sufficient to ensure that later, when we zeroize conduit->dsa_ptr, there will be no concurrent dsa_switch_rcv() call on this conduit. The addition of the netif_device_detach() function is to ensure that ioctls, rtnetlinks and ethtool requests on the user ports no longer propagate down to the driver - we're no longer prepared to handle them. Link: https://lore.kernel.org/netdev/2d2e3bba17203c14a5ffdabc174e3b6bbb9ad438.camel@siemens.com/ Link: https://lore.kernel.org/netdev/c1bf4de54e829111e0e4a70e7bd1cf523c9550ff.camel@siemens.com/ Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> --- net/dsa/dsa.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c index 668c729946ea..1664547deffd 100644 --- a/net/dsa/dsa.c +++ b/net/dsa/dsa.c @@ -1577,6 +1577,7 @@ EXPORT_SYMBOL_GPL(dsa_unregister_switch); void dsa_switch_shutdown(struct dsa_switch *ds) { struct net_device *conduit, *user_dev; + LIST_HEAD(close_list); struct dsa_port *dp; mutex_lock(&dsa2_mutex); @@ -1586,10 +1587,16 @@ void dsa_switch_shutdown(struct dsa_switch *ds) rtnl_lock(); + dsa_switch_for_each_cpu_port(dp, ds) + list_add(&dp->conduit->close_list, &close_list); + + dev_close_many(&close_list, true); + dsa_switch_for_each_user_port(dp, ds) { conduit = dsa_port_to_conduit(dp); user_dev = dp->user; + netif_device_detach(user_dev); netdev_upper_dev_unlink(conduit, user_dev); } -- 2.34.1 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH net] net: dsa: lan9303: avoid dsa_switch_shutdown() 2024-09-12 10:15 ` Vladimir Oltean @ 2024-09-13 16:16 ` Sverdlin, Alexander 2024-09-13 18:57 ` Vladimir Oltean 0 siblings, 1 reply; 11+ messages in thread From: Sverdlin, Alexander @ 2024-09-13 16:16 UTC (permalink / raw) To: olteanv@gmail.com Cc: andrew@lunn.ch, davem@davemloft.net, f.fainelli@gmail.com, pabeni@redhat.com, kuba@kernel.org, stable@vger.kernel.org, edumazet@google.com, netdev@vger.kernel.org Hi Vladimir! Thank you for the quick fix! On Thu, 2024-09-12 at 13:15 +0300, Vladimir Oltean wrote: > > From: Alexander Sverdlin <alexander.sverdlin@siemens.com> > > > > dsa_switch_shutdown() doesn't bring down any ports, but only disconnects > > slaves from master. Packets still come afterwards into master port and the > > ports are being polled for link status. This leads to crashes: > > > > Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 > > CPU: 0 PID: 442 Comm: kworker/0:3 Tainted: G O 6.1.99+ #1 > > Workqueue: events_power_efficient phy_state_machine > > pc : lan9303_mdio_phy_read > > lr : lan9303_phy_read > > Call trace: > > lan9303_mdio_phy_read > > lan9303_phy_read > > dsa_slave_phy_read > > __mdiobus_read > > mdiobus_read > > genphy_update_link > > genphy_read_status > > phy_check_link_status > > phy_state_machine > > process_one_work > > worker_thread > > > > Call lan9303_remove() instead to really unregister all ports before zeroing > > drvdata and dsa_ptr. > > > > Fixes: 0650bf52b31f ("net: dsa: be compatible with masters which unregister on shutdown") Do you think it would make sense to add the same Fixes: tag as above? (That's the earlier one of the two) > > Cc: stable@vger.kernel.org > > Signed-off-by: Alexander Sverdlin <alexander.sverdlin@siemens.com> > > --- > > Could you please test this alternative solution (patch attached) for both reported problems? We had two LAN9303-equipped systems running overnight with PROVE_LOCKING+PROVE_RCU and without, and I also ran couple of reboots with PROVE_LOCKING on a Marvell mv6xxx equipped HW. All of the above for a backport to v6.1, but this part should be OK, I believe. Overall looks very good, you can add my Reviewed-by: Alexander Sverdlin <alexander.sverdlin@siemens.com> Tested-by: Alexander Sverdlin <alexander.sverdlin@siemens.com> when you officially publish the patch. -- Alexander Sverdlin Siemens AG www.siemens.com ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net] net: dsa: lan9303: avoid dsa_switch_shutdown() 2024-09-13 16:16 ` Sverdlin, Alexander @ 2024-09-13 18:57 ` Vladimir Oltean 0 siblings, 0 replies; 11+ messages in thread From: Vladimir Oltean @ 2024-09-13 18:57 UTC (permalink / raw) To: Sverdlin, Alexander Cc: andrew@lunn.ch, davem@davemloft.net, f.fainelli@gmail.com, pabeni@redhat.com, kuba@kernel.org, stable@vger.kernel.org, edumazet@google.com, netdev@vger.kernel.org Hi Alexander, On Fri, Sep 13, 2024 at 04:16:57PM +0000, Sverdlin, Alexander wrote: > > > Fixes: 0650bf52b31f ("net: dsa: be compatible with masters which unregister on shutdown") > > Do you think it would make sense to add the same Fixes: tag as above? > (That's the earlier one of the two) > > > > Cc: stable@vger.kernel.org > > > Signed-off-by: Alexander Sverdlin <alexander.sverdlin@siemens.com> > > > --- > > > > Could you please test this alternative solution (patch attached) for both reported problems? > > We had two LAN9303-equipped systems running overnight with PROVE_LOCKING+PROVE_RCU and without, > and I also ran couple of reboots with PROVE_LOCKING on a Marvell mv6xxx equipped HW. > All of the above for a backport to v6.1, but this part should be OK, I believe. > > Overall looks very good, you can add my > Reviewed-by: Alexander Sverdlin <alexander.sverdlin@siemens.com> > Tested-by: Alexander Sverdlin <alexander.sverdlin@siemens.com> > when you officially publish the patch. Thanks for testing, I really appreciate it. I will add all the required tags for the formal patch submission. ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2024-09-13 18:57 UTC | newest] Thread overview: 11+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-09-04 9:07 [PATCH net] net: dsa: lan9303: avoid dsa_switch_shutdown() Alexander Sverdlin 2024-09-11 14:33 ` Jakub Kicinski -- strict thread matches above, loose matches on Subject: below -- 2024-09-11 14:40 A. Sverdlin 2024-09-11 16:28 ` Vladimir Oltean 2024-09-11 16:55 ` Sverdlin, Alexander 2024-09-11 17:04 ` Sverdlin, Alexander 2024-09-11 17:09 ` Sverdlin, Alexander 2024-09-11 17:19 ` Vladimir Oltean 2024-09-12 10:15 ` Vladimir Oltean 2024-09-13 16:16 ` Sverdlin, Alexander 2024-09-13 18:57 ` Vladimir Oltean
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).