All of lore.kernel.org
 help / color / mirror / Atom feed
From: Florian Fainelli <f.fainelli@gmail.com>
To: Heyi Guo <guoheyi@linux.alibaba.com>, linux-kernel@vger.kernel.org
Cc: Andrew Lunn <andrew@lunn.ch>,
	"David S. Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>, Joel Stanley <joel@jms.id.au>,
	Guangbin Huang <huangguangbin2@huawei.com>,
	Hao Chen <chenhao288@hisilicon.com>,
	Arnd Bergmann <arnd@arndb.de>,
	Dylan Hung <dylan_hung@aspeedtech.com>,
	netdev@vger.kernel.org
Subject: Re: [PATCH 3/3] drivers/net/ftgmac100: fix DHCP potential failure with systemd
Date: Wed, 23 Feb 2022 09:55:29 -0800	[thread overview]
Message-ID: <91e2d4ad-7544-784b-defe-3a76577462f1@gmail.com> (raw)
In-Reply-To: <5cdf5d09-9b32-ec98-cbd1-c05365ec01fa@linux.alibaba.com>



On 2/23/2022 3:39 AM, Heyi Guo wrote:
> Hi Florian,
> 
> 在 2022/2/23 下午1:00, Florian Fainelli 写道:
>>
>>
>> On 2/22/2022 7:14 PM, Heyi Guo wrote:
>>> DHCP failures were observed with systemd 247.6. The issue could be
>>> reproduced by rebooting Aspeed 2600 and then running ifconfig ethX
>>> down/up.
>>>
>>> It is caused by below procedures in the driver:
>>>
>>> 1. ftgmac100_open() enables net interface and call phy_start()
>>> 2. When PHY is link up, it calls netif_carrier_on() and then
>>> adjust_link callback
>>> 3. ftgmac100_adjust_link() will schedule the reset task
>>> 4. ftgmac100_reset_task() will then reset the MAC in another schedule
>>>
>>> After step 2, systemd will be notified to send DHCP discover packet,
>>> while the packet might be corrupted by MAC reset operation in step 4.
>>>
>>> Call ftgmac100_reset() directly instead of scheduling task to fix the
>>> issue.
>>>
>>> Signed-off-by: Heyi Guo <guoheyi@linux.alibaba.com>
>>> ---
>>> Cc: Andrew Lunn <andrew@lunn.ch>
>>> Cc: "David S. Miller" <davem@davemloft.net>
>>> Cc: Jakub Kicinski <kuba@kernel.org>
>>> Cc: Joel Stanley <joel@jms.id.au>
>>> Cc: Guangbin Huang <huangguangbin2@huawei.com>
>>> Cc: Hao Chen <chenhao288@hisilicon.com>
>>> Cc: Arnd Bergmann <arnd@arndb.de>
>>> Cc: Dylan Hung <dylan_hung@aspeedtech.com>
>>> Cc: netdev@vger.kernel.org
>>>
>>>
>>> ---
>>>   drivers/net/ethernet/faraday/ftgmac100.c | 13 +++++++++++--
>>>   1 file changed, 11 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/net/ethernet/faraday/ftgmac100.c 
>>> b/drivers/net/ethernet/faraday/ftgmac100.c
>>> index c1deb6e5d26c5..d5356db7539a4 100644
>>> --- a/drivers/net/ethernet/faraday/ftgmac100.c
>>> +++ b/drivers/net/ethernet/faraday/ftgmac100.c
>>> @@ -1402,8 +1402,17 @@ static void ftgmac100_adjust_link(struct 
>>> net_device *netdev)
>>>       /* Disable all interrupts */
>>>       iowrite32(0, priv->base + FTGMAC100_OFFSET_IER);
>>>   -    /* Reset the adapter asynchronously */
>>> -    schedule_work(&priv->reset_task);
>>> +    /* Release phy lock to allow ftgmac100_reset to aquire it, 
>>> keeping lock
>>
>> typo: acquire
>>
> Thanks for the catch :)
>>> +     * order consistent to prevent dead lock.
>>> +     */
>>> +    if (netdev->phydev)
>>> +        mutex_unlock(&netdev->phydev->lock);
>>> +
>>> +    ftgmac100_reset(priv);
>>> +
>>> +    if (netdev->phydev)
>>> +        mutex_lock(&netdev->phydev->lock);
>>
>> Do you really need to perform a full MAC reset whenever the link goes 
>> up or down? Instead cannot you just extract the maccr configuration 
>> which adjusts the speed and be done with it?
> 
> This is the original behavior and not changed in this patch set, and I'm 
> not familiar with the hardware design of ftgmac100, so I'd like to limit 
> the changes to the code which really causes practical issues.

This unlocking and re-locking seems superfluous when you could introduce 
a version of ftgmac100_reset() which does not acquire the PHY device 
mutex, and have that version called from ftgmac100_adjust_link(). For 
every other call site, you would acquire it. Something like this for 
instance:


diff --git a/drivers/net/ethernet/faraday/ftgmac100.c 
b/drivers/net/ethernet/faraday/ftgmac100.c
index 691605c15265..98179c3fd9ee 100644
--- a/drivers/net/ethernet/faraday/ftgmac100.c
+++ b/drivers/net/ethernet/faraday/ftgmac100.c
@@ -1038,7 +1038,7 @@ static void ftgmac100_adjust_link(struct 
net_device *netdev)
         iowrite32(0, priv->base + FTGMAC100_OFFSET_IER);

         /* Reset the adapter asynchronously */
-       schedule_work(&priv->reset_task);
+       ftgmac100_reset(priv, false);
  }

  static int ftgmac100_mii_probe(struct net_device *netdev)
@@ -1410,10 +1410,8 @@ static int ftgmac100_init_all(struct ftgmac100 
*priv, bool ignore_alloc_err)
         return err;
  }

-static void ftgmac100_reset_task(struct work_struct *work)
+static void ftgmac100_reset_task(struct ftgmac100_priv *priv, bool 
lock_phy)
  {
-       struct ftgmac100 *priv = container_of(work, struct ftgmac100,
-                                             reset_task);
         struct net_device *netdev = priv->netdev;
         int err;

@@ -1421,7 +1419,7 @@ static void ftgmac100_reset_task(struct 
work_struct *work)

         /* Lock the world */
         rtnl_lock();
-       if (netdev->phydev)
+       if (netdev->phydev && lock_phy)
                 mutex_lock(&netdev->phydev->lock);
         if (priv->mii_bus)
                 mutex_lock(&priv->mii_bus->mdio_lock);
@@ -1454,11 +1452,19 @@ static void ftgmac100_reset_task(struct 
work_struct *work)
   bail:
         if (priv->mii_bus)
                 mutex_unlock(&priv->mii_bus->mdio_lock);
-       if (netdev->phydev)
+       if (netdev->phydev && lock_phy)
                 mutex_unlock(&netdev->phydev->lock);
         rtnl_unlock();
  }

+static void ftgmac100_reset_task(struct work_struct *work)
+{
+       struct ftgmac100 *priv = container_of(work, struct ftgmac100,
+                                             reset_task);
+
+       ftgmac100_reset(priv, true);
+}
+
  static int ftgmac100_open(struct net_device *netdev)
  {
         struct ftgmac100 *priv = netdev_priv(netdev)
-- 
Florian

  reply	other threads:[~2022-02-23 17:55 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-23  3:14 [PATCH 0/3] drivers/net/ftgmac100: fix occasional DHCP failure Heyi Guo
2022-02-23  3:14 ` [PATCH 1/3] drivers/net/ftgmac100: refactor ftgmac100_reset_task to enable direct function call Heyi Guo
2022-02-23  3:14 ` [PATCH 2/3] drivers/net/ftgmac100: adjust code place for function call dependency Heyi Guo
2022-02-23  3:14 ` [PATCH 3/3] drivers/net/ftgmac100: fix DHCP potential failure with systemd Heyi Guo
2022-02-23  5:00   ` Florian Fainelli
2022-02-23 11:39     ` Heyi Guo
2022-02-23 17:55       ` Florian Fainelli [this message]
2022-02-23 18:05         ` Florian Fainelli
2022-02-24  2:52         ` Heyi Guo
2022-02-23 10:48   ` Andrew Lunn
2022-02-23 11:40     ` Heyi Guo
2022-02-23 13:20 ` [PATCH 0/3] drivers/net/ftgmac100: fix occasional DHCP failure patchwork-bot+netdevbpf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=91e2d4ad-7544-784b-defe-3a76577462f1@gmail.com \
    --to=f.fainelli@gmail.com \
    --cc=andrew@lunn.ch \
    --cc=arnd@arndb.de \
    --cc=chenhao288@hisilicon.com \
    --cc=davem@davemloft.net \
    --cc=dylan_hung@aspeedtech.com \
    --cc=guoheyi@linux.alibaba.com \
    --cc=huangguangbin2@huawei.com \
    --cc=joel@jms.id.au \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.