netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net 0/2] net: ethernet: ftgmac100: fixes for ncsi/phy handling on device remove
@ 2024-10-28  4:54 Jeremy Kerr
  2024-10-28  4:54 ` [PATCH net 1/2] net: ethernet: ftgmac100: prevent use after free on unregister when using NCSI Jeremy Kerr
  2024-10-28  4:54 ` [PATCH net 2/2] net: ethernet: ftgmac100: fix NULL phy usage on device remove Jeremy Kerr
  0 siblings, 2 replies; 17+ messages in thread
From: Jeremy Kerr @ 2024-10-28  4:54 UTC (permalink / raw)
  To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Joel Stanley, Jacky Chou, Jacob Keller
  Cc: netdev

We have a couple of bugs in the ftgmac remove path when used with ncsi,
when handling the ncsi_dev device, and now the new fixed-phy device
introduced in e24a6c874601.

Jacky: you may want to incorporate a remove test when running with NCSI
configurations:

    echo 1e660000.ethernet > /sys/class/net/eth0/device/driver/unbind

(and probably helpful on non-ncsi too, I guess!)

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
---
Jeremy Kerr (2):
      net: ethernet: ftgmac100: prevent use after free on unregister when using NCSI
      net: ethernet: ftgmac100: fix NULL phy usage on device remove

 drivers/net/ethernet/faraday/ftgmac100.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)
---
base-commit: e31a8219fbfcf9dc65ba1e1c10cade12b6754e00
change-id: 20241028-ftgmac-fixes-abffe9b49c36

Best regards,
-- 
Jeremy Kerr <jk@codeconstruct.com.au>


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH net 1/2] net: ethernet: ftgmac100: prevent use after free on unregister when using NCSI
  2024-10-28  4:54 [PATCH net 0/2] net: ethernet: ftgmac100: fixes for ncsi/phy handling on device remove Jeremy Kerr
@ 2024-10-28  4:54 ` Jeremy Kerr
  2024-10-28 18:33   ` Jacob Keller
  2024-10-28 20:15   ` Andrew Lunn
  2024-10-28  4:54 ` [PATCH net 2/2] net: ethernet: ftgmac100: fix NULL phy usage on device remove Jeremy Kerr
  1 sibling, 2 replies; 17+ messages in thread
From: Jeremy Kerr @ 2024-10-28  4:54 UTC (permalink / raw)
  To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Joel Stanley, Jacky Chou, Jacob Keller
  Cc: netdev

When removing an active ftgmac100 netdev that is configured for NCSI, we
have a double free of the ncsi device: We currently unregister the ncsi
device (freeing it), then unregister the netdev itself. If the netdev is
running, the netdev_unregister() path performs a ->ndo_stop(), which
calls ncsi_stop_dev() on the now-free ncsi pointer.

Instead, modify ftgmac100_stop() to check the ncsi pointer before
freeing (rather than use_ncsi, which reflects configuration intent), and
clear the pointer once we have done the ncsi_unregister().

Fixes: 3d5179458d22 ("net: ftgmac100: Fix crash when removing driver")
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
---
 drivers/net/ethernet/faraday/ftgmac100.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/faraday/ftgmac100.c b/drivers/net/ethernet/faraday/ftgmac100.c
index 0b61f548fd188f64344ff67136c882a49598f6d3..9caee68468ff5f71d7ea63a0c8c9ec2be4a718bc 100644
--- a/drivers/net/ethernet/faraday/ftgmac100.c
+++ b/drivers/net/ethernet/faraday/ftgmac100.c
@@ -1599,7 +1599,7 @@ static int ftgmac100_stop(struct net_device *netdev)
 	netif_napi_del(&priv->napi);
 	if (netdev->phydev)
 		phy_stop(netdev->phydev);
-	if (priv->use_ncsi)
+	if (priv->ndev)
 		ncsi_stop_dev(priv->ndev);
 
 	ftgmac100_stop_hw(priv);
@@ -2059,8 +2059,10 @@ static void ftgmac100_remove(struct platform_device *pdev)
 	netdev = platform_get_drvdata(pdev);
 	priv = netdev_priv(netdev);
 
-	if (priv->ndev)
+	if (priv->ndev) {
 		ncsi_unregister_dev(priv->ndev);
+		priv->ndev = NULL;
+	}
 	unregister_netdev(netdev);
 
 	clk_disable_unprepare(priv->rclk);

-- 
2.39.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH net 2/2] net: ethernet: ftgmac100: fix NULL phy usage on device remove
  2024-10-28  4:54 [PATCH net 0/2] net: ethernet: ftgmac100: fixes for ncsi/phy handling on device remove Jeremy Kerr
  2024-10-28  4:54 ` [PATCH net 1/2] net: ethernet: ftgmac100: prevent use after free on unregister when using NCSI Jeremy Kerr
@ 2024-10-28  4:54 ` Jeremy Kerr
  2024-10-28  5:58   ` 回覆: " Jacky Chou
                     ` (2 more replies)
  1 sibling, 3 replies; 17+ messages in thread
From: Jeremy Kerr @ 2024-10-28  4:54 UTC (permalink / raw)
  To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Joel Stanley, Jacky Chou, Jacob Keller
  Cc: netdev

Commit e24a6c874601 ("net: ftgmac100: Get link speed and duplex for
NC-SI") introduced a fixed phydev attached to the ftgmac netdev for ncsi
configurations, cleaned up on remove as:

    phy_disconnect(netdev->phydev);

    /* ... */

    if (priv->use_ncsi)
        fixed_phy_unregister(netdev->phydev);

However, phy_disconnect() will clear the netdev's ->phydev pointer, so
the fixed_phy_unregister() will always be invoked with a null pointer.

Use a temporary for the phydev, rather than expecting the netdev->phydev
point to be valid over the phy_disconnect().

Fixes: e24a6c874601 ("net: ftgmac100: Get link speed and duplex for NC-SI")
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
---
 drivers/net/ethernet/faraday/ftgmac100.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/faraday/ftgmac100.c b/drivers/net/ethernet/faraday/ftgmac100.c
index 9caee68468ff5f71d7ea63a0c8c9ec2be4a718bc..c6ed7ed0e2389a45a671b85ae60936df99458cd1 100644
--- a/drivers/net/ethernet/faraday/ftgmac100.c
+++ b/drivers/net/ethernet/faraday/ftgmac100.c
@@ -1730,16 +1730,17 @@ static int ftgmac100_setup_mdio(struct net_device *netdev)
 static void ftgmac100_phy_disconnect(struct net_device *netdev)
 {
 	struct ftgmac100 *priv = netdev_priv(netdev);
+	struct phy_device *phydev = netdev->phydev;
 
-	if (!netdev->phydev)
+	if (!phydev)
 		return;
 
-	phy_disconnect(netdev->phydev);
+	phy_disconnect(phydev);
 	if (of_phy_is_fixed_link(priv->dev->of_node))
 		of_phy_deregister_fixed_link(priv->dev->of_node);
 
 	if (priv->use_ncsi)
-		fixed_phy_unregister(netdev->phydev);
+		fixed_phy_unregister(phydev);
 }
 
 static void ftgmac100_destroy_mdio(struct net_device *netdev)

-- 
2.39.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* 回覆: [PATCH net 2/2] net: ethernet: ftgmac100: fix NULL phy usage on device remove
  2024-10-28  4:54 ` [PATCH net 2/2] net: ethernet: ftgmac100: fix NULL phy usage on device remove Jeremy Kerr
@ 2024-10-28  5:58   ` Jacky Chou
  2024-10-28 18:34   ` Jacob Keller
  2024-10-28 20:23   ` Andrew Lunn
  2 siblings, 0 replies; 17+ messages in thread
From: Jacky Chou @ 2024-10-28  5:58 UTC (permalink / raw)
  To: Jeremy Kerr, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Joel Stanley, Jacob Keller
  Cc: netdev@vger.kernel.org

Hi Jeremy,

I checked the phy_disconnect function again.
It will assign NULL pointer to netdev->phydev after calling itself.
In NC-SI mode, it does cause a crash when calling fixed_phy_unregister(netdev->phydev).
Thank you for your help in pointing out my mistack.

Thanks,
Jacky

Reviewed-by: Jacky Chou <jacky_chou@aspeedtech.com>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net 1/2] net: ethernet: ftgmac100: prevent use after free on unregister when using NCSI
  2024-10-28  4:54 ` [PATCH net 1/2] net: ethernet: ftgmac100: prevent use after free on unregister when using NCSI Jeremy Kerr
@ 2024-10-28 18:33   ` Jacob Keller
  2024-10-28 20:15   ` Andrew Lunn
  1 sibling, 0 replies; 17+ messages in thread
From: Jacob Keller @ 2024-10-28 18:33 UTC (permalink / raw)
  To: Jeremy Kerr, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Joel Stanley, Jacky Chou
  Cc: netdev



On 10/27/2024 9:54 PM, Jeremy Kerr wrote:
> When removing an active ftgmac100 netdev that is configured for NCSI, we
> have a double free of the ncsi device: We currently unregister the ncsi
> device (freeing it), then unregister the netdev itself. If the netdev is
> running, the netdev_unregister() path performs a ->ndo_stop(), which
> calls ncsi_stop_dev() on the now-free ncsi pointer.
> 
> Instead, modify ftgmac100_stop() to check the ncsi pointer before
> freeing (rather than use_ncsi, which reflects configuration intent), and
> clear the pointer once we have done the ncsi_unregister().
> 
> Fixes: 3d5179458d22 ("net: ftgmac100: Fix crash when removing driver")
> Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
> ---

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net 2/2] net: ethernet: ftgmac100: fix NULL phy usage on device remove
  2024-10-28  4:54 ` [PATCH net 2/2] net: ethernet: ftgmac100: fix NULL phy usage on device remove Jeremy Kerr
  2024-10-28  5:58   ` 回覆: " Jacky Chou
@ 2024-10-28 18:34   ` Jacob Keller
  2024-10-28 20:23   ` Andrew Lunn
  2 siblings, 0 replies; 17+ messages in thread
From: Jacob Keller @ 2024-10-28 18:34 UTC (permalink / raw)
  To: Jeremy Kerr, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Joel Stanley, Jacky Chou
  Cc: netdev



On 10/27/2024 9:54 PM, Jeremy Kerr wrote:
> Commit e24a6c874601 ("net: ftgmac100: Get link speed and duplex for
> NC-SI") introduced a fixed phydev attached to the ftgmac netdev for ncsi
> configurations, cleaned up on remove as:
> 
>     phy_disconnect(netdev->phydev);
> 
>     /* ... */
> 
>     if (priv->use_ncsi)
>         fixed_phy_unregister(netdev->phydev);
> 
> However, phy_disconnect() will clear the netdev's ->phydev pointer, so
> the fixed_phy_unregister() will always be invoked with a null pointer.
> 
> Use a temporary for the phydev, rather than expecting the netdev->phydev
> point to be valid over the phy_disconnect().
> 
> Fixes: e24a6c874601 ("net: ftgmac100: Get link speed and duplex for NC-SI")
> Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
> ---

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net 1/2] net: ethernet: ftgmac100: prevent use after free on unregister when using NCSI
  2024-10-28  4:54 ` [PATCH net 1/2] net: ethernet: ftgmac100: prevent use after free on unregister when using NCSI Jeremy Kerr
  2024-10-28 18:33   ` Jacob Keller
@ 2024-10-28 20:15   ` Andrew Lunn
  2024-10-29  4:32     ` Jeremy Kerr
  1 sibling, 1 reply; 17+ messages in thread
From: Andrew Lunn @ 2024-10-28 20:15 UTC (permalink / raw)
  To: Jeremy Kerr
  Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Joel Stanley, Jacky Chou, Jacob Keller, netdev

On Mon, Oct 28, 2024 at 12:54:10PM +0800, Jeremy Kerr wrote:
> When removing an active ftgmac100 netdev that is configured for NCSI, we
> have a double free of the ncsi device: We currently unregister the ncsi
> device (freeing it), then unregister the netdev itself. If the netdev is
> running, the netdev_unregister() path performs a ->ndo_stop(), which
> calls ncsi_stop_dev() on the now-free ncsi pointer.
> 
> Instead, modify ftgmac100_stop() to check the ncsi pointer before
> freeing (rather than use_ncsi, which reflects configuration intent), and
> clear the pointer once we have done the ncsi_unregister().

This seems like the wrong fix. I would expect ftgmac100_stop() to be a
mirror of ftgmac100_open(). So unregistering in ndo_stop is correct,
and the real double free bug is in ftgmac100_remove().
ftgmac100_remove() should be a mirror of ftgmac100_probe() which does
not register the ncsi device....

    Andrew

---
pw-bot: cr

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net 2/2] net: ethernet: ftgmac100: fix NULL phy usage on device remove
  2024-10-28  4:54 ` [PATCH net 2/2] net: ethernet: ftgmac100: fix NULL phy usage on device remove Jeremy Kerr
  2024-10-28  5:58   ` 回覆: " Jacky Chou
  2024-10-28 18:34   ` Jacob Keller
@ 2024-10-28 20:23   ` Andrew Lunn
  2024-10-29  4:36     ` Jeremy Kerr
  2 siblings, 1 reply; 17+ messages in thread
From: Andrew Lunn @ 2024-10-28 20:23 UTC (permalink / raw)
  To: Jeremy Kerr
  Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Joel Stanley, Jacky Chou, Jacob Keller, netdev

On Mon, Oct 28, 2024 at 12:54:11PM +0800, Jeremy Kerr wrote:
> Commit e24a6c874601 ("net: ftgmac100: Get link speed and duplex for
> NC-SI") introduced a fixed phydev attached to the ftgmac netdev for ncsi
> configurations, cleaned up on remove as:
> 
>     phy_disconnect(netdev->phydev);
> 
>     /* ... */
> 
>     if (priv->use_ncsi)
>         fixed_phy_unregister(netdev->phydev);
> 
> However, phy_disconnect() will clear the netdev's ->phydev pointer, so
> the fixed_phy_unregister() will always be invoked with a null pointer.
> 
> Use a temporary for the phydev, rather than expecting the netdev->phydev
> point to be valid over the phy_disconnect().
> 
> Fixes: e24a6c874601 ("net: ftgmac100: Get link speed and duplex for NC-SI")
> Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
> ---
>  drivers/net/ethernet/faraday/ftgmac100.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ethernet/faraday/ftgmac100.c b/drivers/net/ethernet/faraday/ftgmac100.c
> index 9caee68468ff5f71d7ea63a0c8c9ec2be4a718bc..c6ed7ed0e2389a45a671b85ae60936df99458cd1 100644
> --- a/drivers/net/ethernet/faraday/ftgmac100.c
> +++ b/drivers/net/ethernet/faraday/ftgmac100.c
> @@ -1730,16 +1730,17 @@ static int ftgmac100_setup_mdio(struct net_device *netdev)
>  static void ftgmac100_phy_disconnect(struct net_device *netdev)

This all seems rather hacky. What is the mirror function to
ftgmac100_phy_disconnect(). I don't see a
ftgmac100_phy_connect(). Generally, if the teardown function does the
same as the setup function, but reverse order, it just works....

     Andrew

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net 1/2] net: ethernet: ftgmac100: prevent use after free on unregister when using NCSI
  2024-10-28 20:15   ` Andrew Lunn
@ 2024-10-29  4:32     ` Jeremy Kerr
  2024-10-29 12:37       ` Andrew Lunn
  2024-10-29 22:36       ` Jakub Kicinski
  0 siblings, 2 replies; 17+ messages in thread
From: Jeremy Kerr @ 2024-10-29  4:32 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Joel Stanley, Jacky Chou, Jacob Keller, netdev

Hi Andrew,

> ftgmac100_remove() should be a mirror of ftgmac100_probe() which does
> not register the ncsi device....

Sure it does:

    static int ftgmac100_probe(struct platform_device *pdev)
    {

        /* ... */

        if (np && of_get_property(np, "use-ncsi", NULL)) {
                if (!IS_ENABLED(CONFIG_NET_NCSI)) {
                        dev_err(&pdev->dev, "NCSI stack not enabled\n");
                        err = -EINVAL;
                        goto err_phy_connect;
                }

                dev_info(&pdev->dev, "Using NCSI interface\n");
                priv->use_ncsi = true;
 =>             priv->ndev = ncsi_register_dev(netdev, ftgmac100_ncsi_handler);
                if (!priv->ndev) {
                        err = -EINVAL;
                        goto err_phy_connect;
                }
- so we're symmetrical in that regard.

On unbind, ->remove is called before ->ndo_stop, as the latter is
invoked through the unregister_netdev():

    [   62.869014] Call trace: 
    [   62.869079]  unwind_backtrace from show_stack+0x18/0x1c
    [   62.869386]  show_stack from dump_stack_lvl+0x68/0x74
    [   62.869575]  dump_stack_lvl from print_report+0x130/0x4d8
    [   62.869771]  print_report from kasan_report+0xa8/0xe8
    [   62.869956]  kasan_report from detach_if_pending+0x49c/0x518
    [   62.870156]  detach_if_pending from timer_delete+0xc4/0x124
    [   62.870350]  timer_delete from work_grab_pending+0x8c/0x8e4
    [   62.870543]  work_grab_pending from __cancel_work+0x84/0x25c
    [   62.870744]  __cancel_work from __cancel_work_sync+0x1c/0x130
    [   62.870930]  __cancel_work_sync from phy_stop+0x118/0x268
    [   62.871114]  phy_stop from ftgmac100_stop+0x160/0x2dc
    [   62.871289]  ftgmac100_stop from __dev_close_many+0x1c8/0x300
    [   62.871481]  __dev_close_many from dev_close_many+0x238/0x578
    [   62.871674]  dev_close_many from unregister_netdevice_many_notify+0x460/0x2368
    [   62.871900]  unregister_netdevice_many_notify from unregister_netdevice_queue+0x27c/0x32c
    [   62.872144]  unregister_netdevice_queue from unregister_netdev+0x20/0x28
    [   62.872348]  unregister_netdev from ftgmac100_remove+0x8c/0x24c
    [   62.872542]  ftgmac100_remove from platform_remove+0x84/0xa4
    [   62.872730]  platform_remove from device_release_driver_internal+0x428/0x5e4
    [   62.872952]  device_release_driver_internal from unbind_store+0xb8/0x108
    [   62.873163]  unbind_store from kernfs_fop_write_iter+0x3a4/0x590
    [   62.873364]  kernfs_fop_write_iter from vfs_write+0x65c/0xec8
    [   62.873567]  vfs_write from ksys_write+0xec/0x1d4
    [   62.873735]  ksys_write from ret_fast_syscall+0x0/0x54

As the ordering in ftgmac100_remove() is:


        if (priv->ndev)
                ncsi_unregister_dev(priv->ndev);
        unregister_netdev(netdev);

which, is (I assume intentionally) symmetric with the _probe, which
does:

                priv->ndev = ncsi_register_dev(netdev, ftgmac100_ncsi_handler);

        /* ... */

        register_netdev(netdev)

So we would either re-order _remove() to do the ncsi_unregister() after
the unregister_netdev(), breaking the symmetry there, or we check for a
valid ncsi device in ->ndo_stop. I have chosen the latter for this
change.

Cheers,


Jeremy

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net 2/2] net: ethernet: ftgmac100: fix NULL phy usage on device remove
  2024-10-28 20:23   ` Andrew Lunn
@ 2024-10-29  4:36     ` Jeremy Kerr
  2024-10-29 12:41       ` Andrew Lunn
  0 siblings, 1 reply; 17+ messages in thread
From: Jeremy Kerr @ 2024-10-29  4:36 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Joel Stanley, Jacky Chou, Jacob Keller, netdev

Hi Andrew,
> This all seems rather hacky. What is the mirror function to
> ftgmac100_phy_disconnect(). I don't see a
> ftgmac100_phy_connect().

There are different paths in physical-phy vs ncsi, so they're
implemented differently in ftgmac100_probe() based on those
configurations.

If you feel the driver needs a rework for the phy setup, that's fine,
but I assume that's not something we want to add as a backported change
in the net tree.

Cheers,


Jeremy

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net 1/2] net: ethernet: ftgmac100: prevent use after free on unregister when using NCSI
  2024-10-29  4:32     ` Jeremy Kerr
@ 2024-10-29 12:37       ` Andrew Lunn
  2024-10-29 14:10         ` Jeremy Kerr
  2024-10-29 22:36       ` Jakub Kicinski
  1 sibling, 1 reply; 17+ messages in thread
From: Andrew Lunn @ 2024-10-29 12:37 UTC (permalink / raw)
  To: Jeremy Kerr
  Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Joel Stanley, Jacky Chou, Jacob Keller, netdev

On Tue, Oct 29, 2024 at 12:32:53PM +0800, Jeremy Kerr wrote:
> Hi Andrew,
> 
> > ftgmac100_remove() should be a mirror of ftgmac100_probe() which does
> > not register the ncsi device....
> 
> Sure it does:
> 
>     static int ftgmac100_probe(struct platform_device *pdev)
>     {
> 
>         /* ... */
> 
>         if (np && of_get_property(np, "use-ncsi", NULL)) {
>                 if (!IS_ENABLED(CONFIG_NET_NCSI)) {
>                         dev_err(&pdev->dev, "NCSI stack not enabled\n");
>                         err = -EINVAL;
>                         goto err_phy_connect;
>                 }
> 
>                 dev_info(&pdev->dev, "Using NCSI interface\n");
>                 priv->use_ncsi = true;
>  =>             priv->ndev = ncsi_register_dev(netdev, ftgmac100_ncsi_handler);
>                 if (!priv->ndev) {
>                         err = -EINVAL;
>                         goto err_phy_connect;
>                 }

Ah, OK, i missed that.

However, _open and _stop are not mirrors.

For ftgmac100_open():

        if (netdev->phydev) {
                /* If we have a PHY, start polling */
                phy_start(netdev->phydev);
        }
        if (priv->use_ncsi) {
                /* If using NC-SI, set our carrier on and start the stack */
                netif_carrier_on(netdev);

                /* Start the NCSI device */
                err = ncsi_start_dev(priv->ndev);
                if (err)
                        goto err_ncsi;
        }


ftgmac100_stop

        if (netdev->phydev)
                phy_stop(netdev->phydev);
        if (priv->use_ncsi)
                ncsi_stop_dev(priv->ndev);

The order should be reversed, you undo in the opposite order to what
you do. This is probably not the issue you are having, but it does
show this driver has ordering issues. If you solve the ordering
issues, i expect your problem goes away.

	Andrew

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net 2/2] net: ethernet: ftgmac100: fix NULL phy usage on device remove
  2024-10-29  4:36     ` Jeremy Kerr
@ 2024-10-29 12:41       ` Andrew Lunn
  0 siblings, 0 replies; 17+ messages in thread
From: Andrew Lunn @ 2024-10-29 12:41 UTC (permalink / raw)
  To: Jeremy Kerr
  Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Joel Stanley, Jacky Chou, Jacob Keller, netdev

On Tue, Oct 29, 2024 at 12:36:44PM +0800, Jeremy Kerr wrote:
> Hi Andrew,
> > This all seems rather hacky. What is the mirror function to
> > ftgmac100_phy_disconnect(). I don't see a
> > ftgmac100_phy_connect().
> 
> There are different paths in physical-phy vs ncsi, so they're
> implemented differently in ftgmac100_probe() based on those
> configurations.
> 
> If you feel the driver needs a rework for the phy setup, that's fine,
> but I assume that's not something we want to add as a backported change
> in the net tree.

Lets see how big those changes are. If a fix needs a patcheset of lots
of small obviously correct changes, GregKH will accept them for
stable.

	Andrew

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net 1/2] net: ethernet: ftgmac100: prevent use after free on unregister when using NCSI
  2024-10-29 12:37       ` Andrew Lunn
@ 2024-10-29 14:10         ` Jeremy Kerr
  0 siblings, 0 replies; 17+ messages in thread
From: Jeremy Kerr @ 2024-10-29 14:10 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Joel Stanley, Jacky Chou, Jacob Keller, netdev

Hi Andrew,


> The order should be reversed, you undo in the opposite order to what
> you do. This is probably not the issue you are having,

No, it's not - the crash occurs before the introduction of the phydev
for ncsi configurations, so the order didn't matter in that case.

> but it does show this driver has ordering issues. If you solve the
> ordering issues, i expect your problem goes away.

"solving the ordering issues" is probably best done by the driver's
maintainers; I have very few facilities for testing the various
configurations that this driver supports.

I'm just sending a couple of minimal patches for crashes I have seen.
All good if you'd like to explore some further driver surgery, but I'm
not the one to do that.

Cheers,


Jeremy

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net 1/2] net: ethernet: ftgmac100: prevent use after free on unregister when using NCSI
  2024-10-29  4:32     ` Jeremy Kerr
  2024-10-29 12:37       ` Andrew Lunn
@ 2024-10-29 22:36       ` Jakub Kicinski
  2024-10-30  0:29         ` Jeremy Kerr
  1 sibling, 1 reply; 17+ messages in thread
From: Jakub Kicinski @ 2024-10-29 22:36 UTC (permalink / raw)
  To: Jeremy Kerr
  Cc: Andrew Lunn, Andrew Lunn, David S. Miller, Eric Dumazet,
	Paolo Abeni, Joel Stanley, Jacky Chou, Jacob Keller, netdev

On Tue, 29 Oct 2024 12:32:53 +0800 Jeremy Kerr wrote:
> As the ordering in ftgmac100_remove() is:
> 
> 
>         if (priv->ndev)
>                 ncsi_unregister_dev(priv->ndev);
>         unregister_netdev(netdev);
> 
> which, is (I assume intentionally) symmetric with the _probe, which
> does:
> 
>                 priv->ndev = ncsi_register_dev(netdev, ftgmac100_ncsi_handler);
> 
>         /* ... */
> 
>         register_netdev(netdev)

To be clear - symmetric means mirror image.
I agree with Andrew.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net 1/2] net: ethernet: ftgmac100: prevent use after free on unregister when using NCSI
  2024-10-29 22:36       ` Jakub Kicinski
@ 2024-10-30  0:29         ` Jeremy Kerr
  2024-10-30  2:58           ` Jeremy Kerr
  0 siblings, 1 reply; 17+ messages in thread
From: Jeremy Kerr @ 2024-10-30  0:29 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Andrew Lunn, Andrew Lunn, David S. Miller, Eric Dumazet,
	Paolo Abeni, Joel Stanley, Jacky Chou, Jacob Keller, netdev

Hi Jakub,

> > As the ordering in ftgmac100_remove() is:
> > 
> > 
> >         if (priv->ndev)
> >                 ncsi_unregister_dev(priv->ndev);
> >         unregister_netdev(netdev);
> > 
> > which, is (I assume intentionally) symmetric with the _probe, which
> > does:
> > 
> >                 priv->ndev = ncsi_register_dev(netdev, ftgmac100_ncsi_handler);
> > 
> >         /* ... */
> > 
> >         register_netdev(netdev)
> 
> To be clear - symmetric means mirror image.

Totally agree with you both on the concept, but I had flipped the
unregister order in my head, sorry!

So, this worth a try with the _remove sequence reordered with respect
to the ncsi device. I'll work on a replacement patch to see if that can
be done without other fallout, and will send a follow-up soon.

Cheers,


Jeremy


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net 1/2] net: ethernet: ftgmac100: prevent use after free on unregister when using NCSI
  2024-10-30  0:29         ` Jeremy Kerr
@ 2024-10-30  2:58           ` Jeremy Kerr
  2024-10-30  9:02             ` Sam Mendoza-Jonas
  0 siblings, 1 reply; 17+ messages in thread
From: Jeremy Kerr @ 2024-10-30  2:58 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Andrew Lunn, Andrew Lunn, David S. Miller, Eric Dumazet,
	Paolo Abeni, Joel Stanley, Jacky Chou, Jacob Keller, netdev,
	Samuel Mendoza-Jonas

Hi all,

[+Sam, as ncsi maintainer]

> So, this worth a try with the _remove sequence reordered with respect
> to the ncsi device. I'll work on a replacement patch to see if that
> can be done without other fallout, and will send a follow-up soon.

OK, not so simple. ftgmac100_probe does a:
  
    ncsi_register_dev()
      -> dev_add_pack()
    register_netdev()

- where ptype.dev is the ftgmac netdev.

So we'd want to restructure the _remove to do:

    unregister_netdev()
    ncsi_unregister_dev()
     -> dev_remove_pack()

However, we (sensibly) can't do the unregister_netdev() with a
packet-type handler still in place.

Sam: would it make sense to move the dev_add_pack() / dev_remove_pack()
to the ncsi layer's ncsi_start_dev() / ncsi_stop_dev() ? The channel
monitor is disabled on stop, so we shouldn't expect to receive any
further NCSI ethertype packets.

Cheers,


Jeremy

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net 1/2] net: ethernet: ftgmac100: prevent use after free on unregister when using NCSI
  2024-10-30  2:58           ` Jeremy Kerr
@ 2024-10-30  9:02             ` Sam Mendoza-Jonas
  0 siblings, 0 replies; 17+ messages in thread
From: Sam Mendoza-Jonas @ 2024-10-30  9:02 UTC (permalink / raw)
  To: Jeremy Kerr, Jakub Kicinski
  Cc: Andrew Lunn, Andrew Lunn, David S. Miller, Eric Dumazet,
	Paolo Abeni, Joel Stanley, Jacky Chou, Jacob Keller, netdev

On 10/30/2024 1:58 PM, Jeremy Kerr wrote:
> Hi all,
>
> [+Sam, as ncsi maintainer]
>
>> So, this worth a try with the _remove sequence reordered with respect
>> to the ncsi device. I'll work on a replacement patch to see if that
>> can be done without other fallout, and will send a follow-up soon.
> OK, not so simple. ftgmac100_probe does a:
>    
>      ncsi_register_dev()
>        -> dev_add_pack()
>      register_netdev()
>
> - where ptype.dev is the ftgmac netdev.
>
> So we'd want to restructure the _remove to do:
>
>      unregister_netdev()
>      ncsi_unregister_dev()
>       -> dev_remove_pack()
>
> However, we (sensibly) can't do the unregister_netdev() with a
> packet-type handler still in place.
>
> Sam: would it make sense to move the dev_add_pack() / dev_remove_pack()
> to the ncsi layer's ncsi_start_dev() / ncsi_stop_dev() ? The channel
> monitor is disabled on stop, so we shouldn't expect to receive any
> further NCSI ethertype packets.

Having a quick re-read of those and ncsi_reset_dev() that looks fine on 
the surface; as you say outside of those no packet handling is happening 
and the link is forced down.

Cheers,
Sam


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2024-10-30  9:02 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-28  4:54 [PATCH net 0/2] net: ethernet: ftgmac100: fixes for ncsi/phy handling on device remove Jeremy Kerr
2024-10-28  4:54 ` [PATCH net 1/2] net: ethernet: ftgmac100: prevent use after free on unregister when using NCSI Jeremy Kerr
2024-10-28 18:33   ` Jacob Keller
2024-10-28 20:15   ` Andrew Lunn
2024-10-29  4:32     ` Jeremy Kerr
2024-10-29 12:37       ` Andrew Lunn
2024-10-29 14:10         ` Jeremy Kerr
2024-10-29 22:36       ` Jakub Kicinski
2024-10-30  0:29         ` Jeremy Kerr
2024-10-30  2:58           ` Jeremy Kerr
2024-10-30  9:02             ` Sam Mendoza-Jonas
2024-10-28  4:54 ` [PATCH net 2/2] net: ethernet: ftgmac100: fix NULL phy usage on device remove Jeremy Kerr
2024-10-28  5:58   ` 回覆: " Jacky Chou
2024-10-28 18:34   ` Jacob Keller
2024-10-28 20:23   ` Andrew Lunn
2024-10-29  4:36     ` Jeremy Kerr
2024-10-29 12:41       ` Andrew Lunn

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).