* [PATCH net 1/3] net: hibmcge: fix rx buf avl irq is not re-enabled in irq_handle issue
2025-10-21 14:00 [PATCH net 0/3] There are some bugfix for hibmcge ethernet driver Jijie Shao
@ 2025-10-21 14:00 ` Jijie Shao
2025-10-24 1:15 ` Jacob Keller
2025-10-21 14:00 ` [PATCH net 2/3] net: hibmcge: remove unnecessary check for np_link_fail in scenarios without phy Jijie Shao
` (2 subsequent siblings)
3 siblings, 1 reply; 14+ messages in thread
From: Jijie Shao @ 2025-10-21 14:00 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni, andrew+netdev, horms
Cc: shenjian15, liuyonglong, chenhao418, lantao5, huangdonghua3,
yangshuaisong, jonathan.cameron, salil.mehta, netdev,
linux-kernel, shaojijie
irq initialized with the macro HBG_ERR_IRQ_I will automatically
be re-enabled, whereas those initialized with the macro HBG_IRQ_I
will not be re-enabled.
Since the rx buf avl irq is initialized using the macro HBG_IRQ_I,
it needs to be actively re-enabled.
Fixes: fd394a334b1c ("net: hibmcge: Add support for abnormal irq handling feature")
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
---
drivers/net/ethernet/hisilicon/hibmcge/hbg_irq.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/hisilicon/hibmcge/hbg_irq.c b/drivers/net/ethernet/hisilicon/hibmcge/hbg_irq.c
index 8af0bc4cca21..ae4cb35186d8 100644
--- a/drivers/net/ethernet/hisilicon/hibmcge/hbg_irq.c
+++ b/drivers/net/ethernet/hisilicon/hibmcge/hbg_irq.c
@@ -32,6 +32,7 @@ static void hbg_irq_handle_rx_buf_val(struct hbg_priv *priv,
const struct hbg_irq_info *irq_info)
{
priv->stats.rx_fifo_less_empty_thrsld_cnt++;
+ hbg_hw_irq_enable(priv, irq_info->mask, true);
}
#define HBG_IRQ_I(name, handle) \
--
2.33.0
^ permalink raw reply related [flat|nested] 14+ messages in thread* Re: [PATCH net 1/3] net: hibmcge: fix rx buf avl irq is not re-enabled in irq_handle issue
2025-10-21 14:00 ` [PATCH net 1/3] net: hibmcge: fix rx buf avl irq is not re-enabled in irq_handle issue Jijie Shao
@ 2025-10-24 1:15 ` Jacob Keller
2025-10-24 6:39 ` Jijie Shao
0 siblings, 1 reply; 14+ messages in thread
From: Jacob Keller @ 2025-10-24 1:15 UTC (permalink / raw)
To: Jijie Shao, davem, edumazet, kuba, pabeni, andrew+netdev, horms
Cc: shenjian15, liuyonglong, chenhao418, lantao5, huangdonghua3,
yangshuaisong, jonathan.cameron, salil.mehta, netdev,
linux-kernel
[-- Attachment #1.1: Type: text/plain, Size: 1647 bytes --]
On 10/21/2025 7:00 AM, Jijie Shao wrote:
> irq initialized with the macro HBG_ERR_IRQ_I will automatically
> be re-enabled, whereas those initialized with the macro HBG_IRQ_I
> will not be re-enabled.
>
> Since the rx buf avl irq is initialized using the macro HBG_IRQ_I,
> it needs to be actively re-enabled.
>
This seems like it would be quite a severe issue. Do you have
reproduction or example of what the failure state looks like?
From the fixed commit, the RX_BUF_AVL used to be HBG_ERR_IRQ_I but now
it uses HBG_IRQ_I so that it can have its own custom handler.. but
HBG_IRQ_I doesn't set re_enable to true...
It seems like a better fix would be having an HBG_ERR_IRQ_I variant that
lets you pass your own function instead of making the handler have to do
the hbg_hw_irq_enable call in its handler?
> Fixes: fd394a334b1c ("net: hibmcge: Add support for abnormal irq handling feature")
> Signed-off-by: Jijie Shao <shaojijie@huawei.com>
> ---
> drivers/net/ethernet/hisilicon/hibmcge/hbg_irq.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/net/ethernet/hisilicon/hibmcge/hbg_irq.c b/drivers/net/ethernet/hisilicon/hibmcge/hbg_irq.c
> index 8af0bc4cca21..ae4cb35186d8 100644
> --- a/drivers/net/ethernet/hisilicon/hibmcge/hbg_irq.c
> +++ b/drivers/net/ethernet/hisilicon/hibmcge/hbg_irq.c
> @@ -32,6 +32,7 @@ static void hbg_irq_handle_rx_buf_val(struct hbg_priv *priv,
> const struct hbg_irq_info *irq_info)
> {
> priv->stats.rx_fifo_less_empty_thrsld_cnt++;
> + hbg_hw_irq_enable(priv, irq_info->mask, true);
> }
>
> #define HBG_IRQ_I(name, handle) \
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 236 bytes --]
^ permalink raw reply [flat|nested] 14+ messages in thread* Re: [PATCH net 1/3] net: hibmcge: fix rx buf avl irq is not re-enabled in irq_handle issue
2025-10-24 1:15 ` Jacob Keller
@ 2025-10-24 6:39 ` Jijie Shao
2025-10-24 20:23 ` Jacob Keller
0 siblings, 1 reply; 14+ messages in thread
From: Jijie Shao @ 2025-10-24 6:39 UTC (permalink / raw)
To: Jacob Keller, davem, edumazet, kuba, pabeni, andrew+netdev, horms
Cc: shaojijie, shenjian15, liuyonglong, chenhao418, lantao5,
huangdonghua3, yangshuaisong, jonathan.cameron, salil.mehta,
netdev, linux-kernel
on 2025/10/24 9:15, Jacob Keller wrote:
>
> On 10/21/2025 7:00 AM, Jijie Shao wrote:
>> irq initialized with the macro HBG_ERR_IRQ_I will automatically
>> be re-enabled, whereas those initialized with the macro HBG_IRQ_I
>> will not be re-enabled.
>>
>> Since the rx buf avl irq is initialized using the macro HBG_IRQ_I,
>> it needs to be actively re-enabled.
>>
> This seems like it would be quite a severe issue. Do you have
> reproduction or example of what the failure state looks like?
priv->stats.rx_fifo_less_empty_thrsld_cnt can only be increased to 1
and cannot be increased further.
It is not a very serious issue, it affects the accuracy of a statistical item.
>
> From the fixed commit, the RX_BUF_AVL used to be HBG_ERR_IRQ_I but now
> it uses HBG_IRQ_I so that it can have its own custom handler.. but
> HBG_IRQ_I doesn't set re_enable to true...
>
> It seems like a better fix would be having an HBG_ERR_IRQ_I variant that
> lets you pass your own function instead of making the handler have to do
> the hbg_hw_irq_enable call in its handler?
Currently, only the RX_BUF_AVL interrupt needs to be enabled separately.
Personally, I think it is acceptable to temporarily not use the an HBG_ERR_IRQ_I variant.
>
>> Fixes: fd394a334b1c ("net: hibmcge: Add support for abnormal irq handling feature")
>> Signed-off-by: Jijie Shao <shaojijie@huawei.com>
>> ---
>> drivers/net/ethernet/hisilicon/hibmcge/hbg_irq.c | 1 +
>> 1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/net/ethernet/hisilicon/hibmcge/hbg_irq.c b/drivers/net/ethernet/hisilicon/hibmcge/hbg_irq.c
>> index 8af0bc4cca21..ae4cb35186d8 100644
>> --- a/drivers/net/ethernet/hisilicon/hibmcge/hbg_irq.c
>> +++ b/drivers/net/ethernet/hisilicon/hibmcge/hbg_irq.c
>> @@ -32,6 +32,7 @@ static void hbg_irq_handle_rx_buf_val(struct hbg_priv *priv,
>> const struct hbg_irq_info *irq_info)
>> {
>> priv->stats.rx_fifo_less_empty_thrsld_cnt++;
>> + hbg_hw_irq_enable(priv, irq_info->mask, true);
>> }
>>
>> #define HBG_IRQ_I(name, handle) \
^ permalink raw reply [flat|nested] 14+ messages in thread* Re: [PATCH net 1/3] net: hibmcge: fix rx buf avl irq is not re-enabled in irq_handle issue
2025-10-24 6:39 ` Jijie Shao
@ 2025-10-24 20:23 ` Jacob Keller
0 siblings, 0 replies; 14+ messages in thread
From: Jacob Keller @ 2025-10-24 20:23 UTC (permalink / raw)
To: Jijie Shao, davem, edumazet, kuba, pabeni, andrew+netdev, horms
Cc: shenjian15, liuyonglong, chenhao418, lantao5, huangdonghua3,
yangshuaisong, jonathan.cameron, salil.mehta, netdev,
linux-kernel
[-- Attachment #1.1: Type: text/plain, Size: 1494 bytes --]
On 10/23/2025 11:39 PM, Jijie Shao wrote:
>
> on 2025/10/24 9:15, Jacob Keller wrote:
>>
>> On 10/21/2025 7:00 AM, Jijie Shao wrote:
>>> irq initialized with the macro HBG_ERR_IRQ_I will automatically
>>> be re-enabled, whereas those initialized with the macro HBG_IRQ_I
>>> will not be re-enabled.
>>>
>>> Since the rx buf avl irq is initialized using the macro HBG_IRQ_I,
>>> it needs to be actively re-enabled.
>>>
>> This seems like it would be quite a severe issue. Do you have
>> reproduction or example of what the failure state looks like?
>
> priv->stats.rx_fifo_less_empty_thrsld_cnt can only be increased to 1
> and cannot be increased further.
>
> It is not a very serious issue, it affects the accuracy of a statistical item.
>
Right, since it only affects this one cause. Got it.
>>
>> From the fixed commit, the RX_BUF_AVL used to be HBG_ERR_IRQ_I but now
>> it uses HBG_IRQ_I so that it can have its own custom handler.. but
>> HBG_IRQ_I doesn't set re_enable to true...
>>
>> It seems like a better fix would be having an HBG_ERR_IRQ_I variant that
>> lets you pass your own function instead of making the handler have to do
>> the hbg_hw_irq_enable call in its handler?
>
> Currently, only the RX_BUF_AVL interrupt needs to be enabled separately.
> Personally, I think it is acceptable to temporarily not use the an HBG_ERR_IRQ_I variant.
>
Sure that seems reasonable.
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 236 bytes --]
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH net 2/3] net: hibmcge: remove unnecessary check for np_link_fail in scenarios without phy.
2025-10-21 14:00 [PATCH net 0/3] There are some bugfix for hibmcge ethernet driver Jijie Shao
2025-10-21 14:00 ` [PATCH net 1/3] net: hibmcge: fix rx buf avl irq is not re-enabled in irq_handle issue Jijie Shao
@ 2025-10-21 14:00 ` Jijie Shao
2025-10-24 1:10 ` Jacob Keller
2025-10-21 14:00 ` [PATCH net 3/3] net: hibmcge: fix the inappropriate netif_device_detach() Jijie Shao
2025-10-24 1:08 ` [PATCH net 0/3] There are some bugfix for hibmcge ethernet driver Jacob Keller
3 siblings, 1 reply; 14+ messages in thread
From: Jijie Shao @ 2025-10-21 14:00 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni, andrew+netdev, horms
Cc: shenjian15, liuyonglong, chenhao418, lantao5, huangdonghua3,
yangshuaisong, jonathan.cameron, salil.mehta, netdev,
linux-kernel, shaojijie
hibmcge driver uses fixed_phy to configure scenarios without PHY,
where the driver is always in a linked state. However,
there might be no link in hardware, so the np_link error
is detected in hbg_hw_adjust_link(), which can cause abnormal logs.
Therefore, in scenarios without a PHY, the driver no longer
checks the np_link status.
Fixes: 1d7cd7a9c69c ("net: hibmcge: support scenario without PHY")
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
---
drivers/net/ethernet/hisilicon/hibmcge/hbg_common.h | 1 +
drivers/net/ethernet/hisilicon/hibmcge/hbg_hw.c | 3 +++
drivers/net/ethernet/hisilicon/hibmcge/hbg_mdio.c | 1 -
3 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/hisilicon/hibmcge/hbg_common.h b/drivers/net/ethernet/hisilicon/hibmcge/hbg_common.h
index ea09a09c451b..2097e4c2b3d7 100644
--- a/drivers/net/ethernet/hisilicon/hibmcge/hbg_common.h
+++ b/drivers/net/ethernet/hisilicon/hibmcge/hbg_common.h
@@ -17,6 +17,7 @@
#define HBG_PCU_CACHE_LINE_SIZE 32
#define HBG_TX_TIMEOUT_BUF_LEN 1024
#define HBG_RX_DESCR 0x01
+#define HBG_NO_PHY 0xFF
#define HBG_PACKET_HEAD_SIZE ((HBG_RX_SKIP1 + HBG_RX_SKIP2 + \
HBG_RX_DESCR) * HBG_PCU_CACHE_LINE_SIZE)
diff --git a/drivers/net/ethernet/hisilicon/hibmcge/hbg_hw.c b/drivers/net/ethernet/hisilicon/hibmcge/hbg_hw.c
index d0aa0661ecd4..d6e8ce8e351a 100644
--- a/drivers/net/ethernet/hisilicon/hibmcge/hbg_hw.c
+++ b/drivers/net/ethernet/hisilicon/hibmcge/hbg_hw.c
@@ -244,6 +244,9 @@ void hbg_hw_adjust_link(struct hbg_priv *priv, u32 speed, u32 duplex)
hbg_hw_mac_enable(priv, HBG_STATUS_ENABLE);
+ if (priv->mac.phy_addr == HBG_NO_PHY)
+ return;
+
/* wait MAC link up */
ret = readl_poll_timeout(priv->io_base + HBG_REG_AN_NEG_STATE_ADDR,
link_status,
diff --git a/drivers/net/ethernet/hisilicon/hibmcge/hbg_mdio.c b/drivers/net/ethernet/hisilicon/hibmcge/hbg_mdio.c
index 37791de47f6f..b6f0a2780ea8 100644
--- a/drivers/net/ethernet/hisilicon/hibmcge/hbg_mdio.c
+++ b/drivers/net/ethernet/hisilicon/hibmcge/hbg_mdio.c
@@ -20,7 +20,6 @@
#define HBG_MDIO_OP_INTERVAL_US (5 * 1000)
#define HBG_NP_LINK_FAIL_RETRY_TIMES 5
-#define HBG_NO_PHY 0xFF
static void hbg_mdio_set_command(struct hbg_mac *mac, u32 cmd)
{
--
2.33.0
^ permalink raw reply related [flat|nested] 14+ messages in thread* Re: [PATCH net 2/3] net: hibmcge: remove unnecessary check for np_link_fail in scenarios without phy.
2025-10-21 14:00 ` [PATCH net 2/3] net: hibmcge: remove unnecessary check for np_link_fail in scenarios without phy Jijie Shao
@ 2025-10-24 1:10 ` Jacob Keller
2025-10-24 6:44 ` Jijie Shao
0 siblings, 1 reply; 14+ messages in thread
From: Jacob Keller @ 2025-10-24 1:10 UTC (permalink / raw)
To: Jijie Shao, davem, edumazet, kuba, pabeni, andrew+netdev, horms
Cc: shenjian15, liuyonglong, chenhao418, lantao5, huangdonghua3,
yangshuaisong, jonathan.cameron, salil.mehta, netdev,
linux-kernel
[-- Attachment #1.1: Type: text/plain, Size: 2669 bytes --]
On 10/21/2025 7:00 AM, Jijie Shao wrote:
> hibmcge driver uses fixed_phy to configure scenarios without PHY,
> where the driver is always in a linked state. However,
> there might be no link in hardware, so the np_link error
> is detected in hbg_hw_adjust_link(), which can cause abnormal logs.
>
Perhaps I am missing something here. You mention the driver is always in
a linked state, but that there could be no link in hardware?
I'm not sure I properly understand whats going wrong here..
> Therefore, in scenarios without a PHY, the driver no longer
> checks the np_link status.
>
> Fixes: 1d7cd7a9c69c ("net: hibmcge: support scenario without PHY")
> Signed-off-by: Jijie Shao <shaojijie@huawei.com>
> ---
> drivers/net/ethernet/hisilicon/hibmcge/hbg_common.h | 1 +
> drivers/net/ethernet/hisilicon/hibmcge/hbg_hw.c | 3 +++
> drivers/net/ethernet/hisilicon/hibmcge/hbg_mdio.c | 1 -
> 3 files changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/hisilicon/hibmcge/hbg_common.h b/drivers/net/ethernet/hisilicon/hibmcge/hbg_common.h
> index ea09a09c451b..2097e4c2b3d7 100644
> --- a/drivers/net/ethernet/hisilicon/hibmcge/hbg_common.h
> +++ b/drivers/net/ethernet/hisilicon/hibmcge/hbg_common.h
> @@ -17,6 +17,7 @@
> #define HBG_PCU_CACHE_LINE_SIZE 32
> #define HBG_TX_TIMEOUT_BUF_LEN 1024
> #define HBG_RX_DESCR 0x01
> +#define HBG_NO_PHY 0xFF
>
> #define HBG_PACKET_HEAD_SIZE ((HBG_RX_SKIP1 + HBG_RX_SKIP2 + \
> HBG_RX_DESCR) * HBG_PCU_CACHE_LINE_SIZE)
> diff --git a/drivers/net/ethernet/hisilicon/hibmcge/hbg_hw.c b/drivers/net/ethernet/hisilicon/hibmcge/hbg_hw.c
> index d0aa0661ecd4..d6e8ce8e351a 100644
> --- a/drivers/net/ethernet/hisilicon/hibmcge/hbg_hw.c
> +++ b/drivers/net/ethernet/hisilicon/hibmcge/hbg_hw.c
> @@ -244,6 +244,9 @@ void hbg_hw_adjust_link(struct hbg_priv *priv, u32 speed, u32 duplex)
>
> hbg_hw_mac_enable(priv, HBG_STATUS_ENABLE);
>
> + if (priv->mac.phy_addr == HBG_NO_PHY)
> + return;
> +
> /* wait MAC link up */
> ret = readl_poll_timeout(priv->io_base + HBG_REG_AN_NEG_STATE_ADDR,
> link_status,
> diff --git a/drivers/net/ethernet/hisilicon/hibmcge/hbg_mdio.c b/drivers/net/ethernet/hisilicon/hibmcge/hbg_mdio.c
> index 37791de47f6f..b6f0a2780ea8 100644
> --- a/drivers/net/ethernet/hisilicon/hibmcge/hbg_mdio.c
> +++ b/drivers/net/ethernet/hisilicon/hibmcge/hbg_mdio.c
> @@ -20,7 +20,6 @@
> #define HBG_MDIO_OP_INTERVAL_US (5 * 1000)
>
> #define HBG_NP_LINK_FAIL_RETRY_TIMES 5
> -#define HBG_NO_PHY 0xFF
>
> static void hbg_mdio_set_command(struct hbg_mac *mac, u32 cmd)
> {
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 236 bytes --]
^ permalink raw reply [flat|nested] 14+ messages in thread* Re: [PATCH net 2/3] net: hibmcge: remove unnecessary check for np_link_fail in scenarios without phy.
2025-10-24 1:10 ` Jacob Keller
@ 2025-10-24 6:44 ` Jijie Shao
0 siblings, 0 replies; 14+ messages in thread
From: Jijie Shao @ 2025-10-24 6:44 UTC (permalink / raw)
To: Jacob Keller, davem, edumazet, kuba, pabeni, andrew+netdev, horms
Cc: shaojijie, shenjian15, liuyonglong, chenhao418, lantao5,
huangdonghua3, yangshuaisong, jonathan.cameron, salil.mehta,
netdev, linux-kernel
on 2025/10/24 9:10, Jacob Keller wrote:
>
> On 10/21/2025 7:00 AM, Jijie Shao wrote:
>> hibmcge driver uses fixed_phy to configure scenarios without PHY,
>> where the driver is always in a linked state. However,
>> there might be no link in hardware, so the np_link error
>> is detected in hbg_hw_adjust_link(), which can cause abnormal logs.
>>
> Perhaps I am missing something here. You mention the driver is always in
> a linked state, but that there could be no link in hardware?
>
> I'm not sure I properly understand whats going wrong here..
No, fixed_phy is a fake PHY that is always in link state and will call adjust_link().
If you are interested, you can take a look at the code for initializing the fixed PHY
in the hibmcge driver: hbg_fixed_phy_init()
Thanks,
Jijie Shao
>
>> Therefore, in scenarios without a PHY, the driver no longer
>> checks the np_link status.
>>
>> Fixes: 1d7cd7a9c69c ("net: hibmcge: support scenario without PHY")
>> Signed-off-by: Jijie Shao <shaojijie@huawei.com>
>> ---
>> drivers/net/ethernet/hisilicon/hibmcge/hbg_common.h | 1 +
>> drivers/net/ethernet/hisilicon/hibmcge/hbg_hw.c | 3 +++
>> drivers/net/ethernet/hisilicon/hibmcge/hbg_mdio.c | 1 -
>> 3 files changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/ethernet/hisilicon/hibmcge/hbg_common.h b/drivers/net/ethernet/hisilicon/hibmcge/hbg_common.h
>> index ea09a09c451b..2097e4c2b3d7 100644
>> --- a/drivers/net/ethernet/hisilicon/hibmcge/hbg_common.h
>> +++ b/drivers/net/ethernet/hisilicon/hibmcge/hbg_common.h
>> @@ -17,6 +17,7 @@
>> #define HBG_PCU_CACHE_LINE_SIZE 32
>> #define HBG_TX_TIMEOUT_BUF_LEN 1024
>> #define HBG_RX_DESCR 0x01
>> +#define HBG_NO_PHY 0xFF
>>
>> #define HBG_PACKET_HEAD_SIZE ((HBG_RX_SKIP1 + HBG_RX_SKIP2 + \
>> HBG_RX_DESCR) * HBG_PCU_CACHE_LINE_SIZE)
>> diff --git a/drivers/net/ethernet/hisilicon/hibmcge/hbg_hw.c b/drivers/net/ethernet/hisilicon/hibmcge/hbg_hw.c
>> index d0aa0661ecd4..d6e8ce8e351a 100644
>> --- a/drivers/net/ethernet/hisilicon/hibmcge/hbg_hw.c
>> +++ b/drivers/net/ethernet/hisilicon/hibmcge/hbg_hw.c
>> @@ -244,6 +244,9 @@ void hbg_hw_adjust_link(struct hbg_priv *priv, u32 speed, u32 duplex)
>>
>> hbg_hw_mac_enable(priv, HBG_STATUS_ENABLE);
>>
>> + if (priv->mac.phy_addr == HBG_NO_PHY)
>> + return;
>> +
>> /* wait MAC link up */
>> ret = readl_poll_timeout(priv->io_base + HBG_REG_AN_NEG_STATE_ADDR,
>> link_status,
>> diff --git a/drivers/net/ethernet/hisilicon/hibmcge/hbg_mdio.c b/drivers/net/ethernet/hisilicon/hibmcge/hbg_mdio.c
>> index 37791de47f6f..b6f0a2780ea8 100644
>> --- a/drivers/net/ethernet/hisilicon/hibmcge/hbg_mdio.c
>> +++ b/drivers/net/ethernet/hisilicon/hibmcge/hbg_mdio.c
>> @@ -20,7 +20,6 @@
>> #define HBG_MDIO_OP_INTERVAL_US (5 * 1000)
>>
>> #define HBG_NP_LINK_FAIL_RETRY_TIMES 5
>> -#define HBG_NO_PHY 0xFF
>>
>> static void hbg_mdio_set_command(struct hbg_mac *mac, u32 cmd)
>> {
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH net 3/3] net: hibmcge: fix the inappropriate netif_device_detach()
2025-10-21 14:00 [PATCH net 0/3] There are some bugfix for hibmcge ethernet driver Jijie Shao
2025-10-21 14:00 ` [PATCH net 1/3] net: hibmcge: fix rx buf avl irq is not re-enabled in irq_handle issue Jijie Shao
2025-10-21 14:00 ` [PATCH net 2/3] net: hibmcge: remove unnecessary check for np_link_fail in scenarios without phy Jijie Shao
@ 2025-10-21 14:00 ` Jijie Shao
2025-10-24 1:05 ` Jacob Keller
2025-10-24 1:08 ` [PATCH net 0/3] There are some bugfix for hibmcge ethernet driver Jacob Keller
3 siblings, 1 reply; 14+ messages in thread
From: Jijie Shao @ 2025-10-21 14:00 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni, andrew+netdev, horms
Cc: shenjian15, liuyonglong, chenhao418, lantao5, huangdonghua3,
yangshuaisong, jonathan.cameron, salil.mehta, netdev,
linux-kernel, shaojijie
current, driver will call netif_device_detach() in
pci_error_handlers.error_detected() and do reset in
pci_error_handlers.slot_reset().
However, if pci_error_handlers.slot_reset() is not called
after pci_error_handlers.error_detected(),
driver will be detached and unable to recover.
drivers/pci/pcie/err.c/report_error_detected() says:
If any device in the subtree does not have an error_detected
callback, PCI_ERS_RESULT_NO_AER_DRIVER prevents subsequent
error callbacks of any device in the subtree, and will
exit in the disconnected error state.
Therefore, when the hibmcge device and other devices that do not
support the error_detected callback are under the same subtree,
hibmcge will be unable to do slot_reset.
This path move netif_device_detach from error_detected to slot_reset,
ensuring that detach and reset are always executed together.
Fixes: fd394a334b1c ("net: hibmcge: Add support for abnormal irq handling feature")
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
---
drivers/net/ethernet/hisilicon/hibmcge/hbg_err.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hibmcge/hbg_err.c b/drivers/net/ethernet/hisilicon/hibmcge/hbg_err.c
index 83cf75bf7a17..e11495b7ee98 100644
--- a/drivers/net/ethernet/hisilicon/hibmcge/hbg_err.c
+++ b/drivers/net/ethernet/hisilicon/hibmcge/hbg_err.c
@@ -136,12 +136,11 @@ static pci_ers_result_t hbg_pci_err_detected(struct pci_dev *pdev,
{
struct net_device *netdev = pci_get_drvdata(pdev);
- netif_device_detach(netdev);
-
- if (state == pci_channel_io_perm_failure)
+ if (state == pci_channel_io_perm_failure) {
+ netif_device_detach(netdev);
return PCI_ERS_RESULT_DISCONNECT;
+ }
- pci_disable_device(pdev);
return PCI_ERS_RESULT_NEED_RESET;
}
@@ -150,6 +149,9 @@ static pci_ers_result_t hbg_pci_err_slot_reset(struct pci_dev *pdev)
struct net_device *netdev = pci_get_drvdata(pdev);
struct hbg_priv *priv = netdev_priv(netdev);
+ netif_device_detach(netdev);
+ pci_disable_device(pdev);
+
if (pci_enable_device(pdev)) {
dev_err(&pdev->dev,
"failed to re-enable PCI device after reset\n");
--
2.33.0
^ permalink raw reply related [flat|nested] 14+ messages in thread* Re: [PATCH net 3/3] net: hibmcge: fix the inappropriate netif_device_detach()
2025-10-21 14:00 ` [PATCH net 3/3] net: hibmcge: fix the inappropriate netif_device_detach() Jijie Shao
@ 2025-10-24 1:05 ` Jacob Keller
2025-10-24 7:21 ` Jijie Shao
0 siblings, 1 reply; 14+ messages in thread
From: Jacob Keller @ 2025-10-24 1:05 UTC (permalink / raw)
To: Jijie Shao, davem, edumazet, kuba, pabeni, andrew+netdev, horms
Cc: shenjian15, liuyonglong, chenhao418, lantao5, huangdonghua3,
yangshuaisong, jonathan.cameron, salil.mehta, netdev,
linux-kernel
[-- Attachment #1.1: Type: text/plain, Size: 2763 bytes --]
On 10/21/2025 7:00 AM, Jijie Shao wrote:
> current, driver will call netif_device_detach() in
> pci_error_handlers.error_detected() and do reset in
> pci_error_handlers.slot_reset().
> However, if pci_error_handlers.slot_reset() is not called
> after pci_error_handlers.error_detected(),
> driver will be detached and unable to recover.
>
> drivers/pci/pcie/err.c/report_error_detected() says:
> If any device in the subtree does not have an error_detected
> callback, PCI_ERS_RESULT_NO_AER_DRIVER prevents subsequent
> error callbacks of any device in the subtree, and will
> exit in the disconnected error state.
>
> Therefore, when the hibmcge device and other devices that do not
> support the error_detected callback are under the same subtree,
> hibmcge will be unable to do slot_reset.
>
Hmm.
In the example case, the slot_reset never happens, but the PCI device is
still in an error state, which means that the device is not functional..
In that case detaching the netdev and remaining detached seems like an
expected outcome?
I guess I don't fully understand the setup in this scenario.
> This path move netif_device_detach from error_detected to slot_reset,
> ensuring that detach and reset are always executed together.
>
> Fixes: fd394a334b1c ("net: hibmcge: Add support for abnormal irq handling feature")
> Signed-off-by: Jijie Shao <shaojijie@huawei.com>
> ---
> drivers/net/ethernet/hisilicon/hibmcge/hbg_err.c | 10 ++++++----
> 1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/ethernet/hisilicon/hibmcge/hbg_err.c b/drivers/net/ethernet/hisilicon/hibmcge/hbg_err.c
> index 83cf75bf7a17..e11495b7ee98 100644
> --- a/drivers/net/ethernet/hisilicon/hibmcge/hbg_err.c
> +++ b/drivers/net/ethernet/hisilicon/hibmcge/hbg_err.c
> @@ -136,12 +136,11 @@ static pci_ers_result_t hbg_pci_err_detected(struct pci_dev *pdev,
> {
> struct net_device *netdev = pci_get_drvdata(pdev);
>
> - netif_device_detach(netdev);
> -
> - if (state == pci_channel_io_perm_failure)
> + if (state == pci_channel_io_perm_failure) {
> + netif_device_detach(netdev);
> return PCI_ERS_RESULT_DISCONNECT;
> + }
>
> - pci_disable_device(pdev);
> return PCI_ERS_RESULT_NEED_RESET;
> }
>
> @@ -150,6 +149,9 @@ static pci_ers_result_t hbg_pci_err_slot_reset(struct pci_dev *pdev)
> struct net_device *netdev = pci_get_drvdata(pdev);
> struct hbg_priv *priv = netdev_priv(netdev);
>
> + netif_device_detach(netdev);
> + pci_disable_device(pdev);
> +
> if (pci_enable_device(pdev)) {
> dev_err(&pdev->dev,
> "failed to re-enable PCI device after reset\n");
Here, we disable the device only to immediately attempt to re-enable it?
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 236 bytes --]
^ permalink raw reply [flat|nested] 14+ messages in thread* Re: [PATCH net 3/3] net: hibmcge: fix the inappropriate netif_device_detach()
2025-10-24 1:05 ` Jacob Keller
@ 2025-10-24 7:21 ` Jijie Shao
2025-10-24 20:24 ` Jacob Keller
0 siblings, 1 reply; 14+ messages in thread
From: Jijie Shao @ 2025-10-24 7:21 UTC (permalink / raw)
To: Jacob Keller, davem, edumazet, kuba, pabeni, andrew+netdev, horms
Cc: shaojijie, shenjian15, liuyonglong, chenhao418, lantao5,
huangdonghua3, yangshuaisong, jonathan.cameron, salil.mehta,
netdev, linux-kernel
on 2025/10/24 9:05, Jacob Keller wrote:
>
> On 10/21/2025 7:00 AM, Jijie Shao wrote:
>> current, driver will call netif_device_detach() in
>> pci_error_handlers.error_detected() and do reset in
>> pci_error_handlers.slot_reset().
>> However, if pci_error_handlers.slot_reset() is not called
>> after pci_error_handlers.error_detected(),
>> driver will be detached and unable to recover.
>>
>> drivers/pci/pcie/err.c/report_error_detected() says:
>> If any device in the subtree does not have an error_detected
>> callback, PCI_ERS_RESULT_NO_AER_DRIVER prevents subsequent
>> error callbacks of any device in the subtree, and will
>> exit in the disconnected error state.
>>
>> Therefore, when the hibmcge device and other devices that do not
>> support the error_detected callback are under the same subtree,
>> hibmcge will be unable to do slot_reset.
>>
> Hmm.
>
> In the example case, the slot_reset never happens, but the PCI device is
> still in an error state, which means that the device is not functional..
>
> In that case detaching the netdev and remaining detached seems like an
> expected outcome?
>
> I guess I don't fully understand the setup in this scenario.
We have encountered some non-fatal errors, such as the SMMU event 0x10,
which triggered the PCIe RAS and caused the network port to become unusable.
the event does not significantly affect the normal use of the network port,
so it is unreasonable to say that the network port cannot be used.
We do our best to ensure the normal use of the network ports;
otherwise, unless there is a serial port,
it will no longer be possible to connect to the board.
To prevent such issues, I move netif_device_detach() from error_detected() to slot_reset().
Either do netif_device_detach() and error_detected(), otherwise do nothing.
>
>> This path move netif_device_detach from error_detected to slot_reset,
>> ensuring that detach and reset are always executed together.
>>
>> Fixes: fd394a334b1c ("net: hibmcge: Add support for abnormal irq handling feature")
>> Signed-off-by: Jijie Shao <shaojijie@huawei.com>
>> ---
>> drivers/net/ethernet/hisilicon/hibmcge/hbg_err.c | 10 ++++++----
>> 1 file changed, 6 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/hisilicon/hibmcge/hbg_err.c b/drivers/net/ethernet/hisilicon/hibmcge/hbg_err.c
>> index 83cf75bf7a17..e11495b7ee98 100644
>> --- a/drivers/net/ethernet/hisilicon/hibmcge/hbg_err.c
>> +++ b/drivers/net/ethernet/hisilicon/hibmcge/hbg_err.c
>> @@ -136,12 +136,11 @@ static pci_ers_result_t hbg_pci_err_detected(struct pci_dev *pdev,
>> {
>> struct net_device *netdev = pci_get_drvdata(pdev);
>>
>> - netif_device_detach(netdev);
>> -
>> - if (state == pci_channel_io_perm_failure)
>> + if (state == pci_channel_io_perm_failure) {
>> + netif_device_detach(netdev);
>> return PCI_ERS_RESULT_DISCONNECT;
>> + }
>>
>> - pci_disable_device(pdev);
>> return PCI_ERS_RESULT_NEED_RESET;
>> }
>>
>> @@ -150,6 +149,9 @@ static pci_ers_result_t hbg_pci_err_slot_reset(struct pci_dev *pdev)
>> struct net_device *netdev = pci_get_drvdata(pdev);
>> struct hbg_priv *priv = netdev_priv(netdev);
>>
>> + netif_device_detach(netdev);
>> + pci_disable_device(pdev);
>> +
>> if (pci_enable_device(pdev)) {
>> dev_err(&pdev->dev,
>> "failed to re-enable PCI device after reset\n");
> Here, we disable the device only to immediately attempt to re-enable it?
Yes, the driver attempts to re-enable the PCIe device.
Thanks,
Jijie Shao
^ permalink raw reply [flat|nested] 14+ messages in thread* Re: [PATCH net 3/3] net: hibmcge: fix the inappropriate netif_device_detach()
2025-10-24 7:21 ` Jijie Shao
@ 2025-10-24 20:24 ` Jacob Keller
0 siblings, 0 replies; 14+ messages in thread
From: Jacob Keller @ 2025-10-24 20:24 UTC (permalink / raw)
To: Jijie Shao, davem, edumazet, kuba, pabeni, andrew+netdev, horms
Cc: shenjian15, liuyonglong, chenhao418, lantao5, huangdonghua3,
yangshuaisong, jonathan.cameron, salil.mehta, netdev,
linux-kernel
[-- Attachment #1.1: Type: text/plain, Size: 1534 bytes --]
On 10/24/2025 12:21 AM, Jijie Shao wrote:
>
> on 2025/10/24 9:05, Jacob Keller wrote:
>>
>> On 10/21/2025 7:00 AM, Jijie Shao wrote:
>>> current, driver will call netif_device_detach() in
>>> pci_error_handlers.error_detected() and do reset in
>>> pci_error_handlers.slot_reset().
>>> However, if pci_error_handlers.slot_reset() is not called
>>> after pci_error_handlers.error_detected(),
>>> driver will be detached and unable to recover.
>>>
>>> drivers/pci/pcie/err.c/report_error_detected() says:
>>> If any device in the subtree does not have an error_detected
>>> callback, PCI_ERS_RESULT_NO_AER_DRIVER prevents subsequent
>>> error callbacks of any device in the subtree, and will
>>> exit in the disconnected error state.
>>>
>>> Therefore, when the hibmcge device and other devices that do not
>>> support the error_detected callback are under the same subtree,
>>> hibmcge will be unable to do slot_reset.
>>>
>> Hmm.
>>
>> In the example case, the slot_reset never happens, but the PCI device is
>> still in an error state, which means that the device is not functional..
>>
>> In that case detaching the netdev and remaining detached seems like an
>> expected outcome?
>>
>> I guess I don't fully understand the setup in this scenario.
>
> We have encountered some non-fatal errors, such as the SMMU event 0x10,
> which triggered the PCIe RAS and caused the network port to become unusable.
>
Right. I forgot the same function is called even for non-fatal errors.
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 236 bytes --]
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH net 0/3] There are some bugfix for hibmcge ethernet driver
2025-10-21 14:00 [PATCH net 0/3] There are some bugfix for hibmcge ethernet driver Jijie Shao
` (2 preceding siblings ...)
2025-10-21 14:00 ` [PATCH net 3/3] net: hibmcge: fix the inappropriate netif_device_detach() Jijie Shao
@ 2025-10-24 1:08 ` Jacob Keller
2025-10-24 7:24 ` Jijie Shao
3 siblings, 1 reply; 14+ messages in thread
From: Jacob Keller @ 2025-10-24 1:08 UTC (permalink / raw)
To: Jijie Shao, davem, edumazet, kuba, pabeni, andrew+netdev, horms
Cc: shenjian15, liuyonglong, chenhao418, lantao5, huangdonghua3,
yangshuaisong, jonathan.cameron, salil.mehta, netdev,
linux-kernel
[-- Attachment #1.1: Type: text/plain, Size: 1128 bytes --]
On 10/21/2025 7:00 AM, Jijie Shao wrote:
> This patch set is intended to fix several issues for hibmcge driver:
> 1. Patch1 fixes the issue where buf avl irq is disabled after irq_handle.
> 2. Patch2 eliminates the error logs in scenarios without phy.
> 3. Patch3 fixes the issue where the network port becomes unusable
> after a PCIe RAS event.
>
>
We typically suggest using imperative wording for the subject such as
"bug fixes for the hibmcge ethernet driver".
> Jijie Shao (3):
> net: hibmcge: fix rx buf avl irq is not re-enabled in irq_handle issue
> net: hibmcge: remove unnecessary check for np_link_fail in scenarios
> without phy.
> net: hibmcge: fix the inappropriate netif_device_detach()
>
> drivers/net/ethernet/hisilicon/hibmcge/hbg_common.h | 1 +
> drivers/net/ethernet/hisilicon/hibmcge/hbg_err.c | 10 ++++++----
> drivers/net/ethernet/hisilicon/hibmcge/hbg_hw.c | 3 +++
> drivers/net/ethernet/hisilicon/hibmcge/hbg_irq.c | 1 +
> drivers/net/ethernet/hisilicon/hibmcge/hbg_mdio.c | 1 -
> 5 files changed, 11 insertions(+), 5 deletions(-)
>
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 236 bytes --]
^ permalink raw reply [flat|nested] 14+ messages in thread* Re: [PATCH net 0/3] There are some bugfix for hibmcge ethernet driver
2025-10-24 1:08 ` [PATCH net 0/3] There are some bugfix for hibmcge ethernet driver Jacob Keller
@ 2025-10-24 7:24 ` Jijie Shao
0 siblings, 0 replies; 14+ messages in thread
From: Jijie Shao @ 2025-10-24 7:24 UTC (permalink / raw)
To: Jacob Keller, davem, edumazet, kuba, pabeni, andrew+netdev, horms
Cc: shaojijie, shenjian15, liuyonglong, chenhao418, lantao5,
huangdonghua3, yangshuaisong, jonathan.cameron, salil.mehta,
netdev, linux-kernel
on 2025/10/24 9:08, Jacob Keller wrote:
>
> On 10/21/2025 7:00 AM, Jijie Shao wrote:
>> This patch set is intended to fix several issues for hibmcge driver:
>> 1. Patch1 fixes the issue where buf avl irq is disabled after irq_handle.
>> 2. Patch2 eliminates the error logs in scenarios without phy.
>> 3. Patch3 fixes the issue where the network port becomes unusable
>> after a PCIe RAS event.
>>
>>
> We typically suggest using imperative wording for the subject such as
> "bug fixes for the hibmcge ethernet driver".
Ok, I will change in V2
Thanks,
Jijie Shao
>> Jijie Shao (3):
>> net: hibmcge: fix rx buf avl irq is not re-enabled in irq_handle issue
>> net: hibmcge: remove unnecessary check for np_link_fail in scenarios
>> without phy.
>> net: hibmcge: fix the inappropriate netif_device_detach()
>>
>> drivers/net/ethernet/hisilicon/hibmcge/hbg_common.h | 1 +
>> drivers/net/ethernet/hisilicon/hibmcge/hbg_err.c | 10 ++++++----
>> drivers/net/ethernet/hisilicon/hibmcge/hbg_hw.c | 3 +++
>> drivers/net/ethernet/hisilicon/hibmcge/hbg_irq.c | 1 +
>> drivers/net/ethernet/hisilicon/hibmcge/hbg_mdio.c | 1 -
>> 5 files changed, 11 insertions(+), 5 deletions(-)
>>
^ permalink raw reply [flat|nested] 14+ messages in thread