netdev.vger.kernel.org archive mirror
* [PATCH net 0/2] There are some bugfix for the HNS3 ethernet driver
@ 2024-06-05  7:20 Jijie Shao
  2024-06-05  7:20 ` [PATCH net 1/2] net: hns3: fix kernel crash problem in concurrent scenario Jijie Shao
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Jijie Shao @ 2024-06-05  7:20 UTC
  To: yisen.zhuang, salil.mehta, davem, edumazet, kuba, pabeni, horms
  Cc: shenjian15, wangjie125, liuyonglong, shaojijie, chenhao418,
	netdev, linux-kernel

This series contains some bug fixes for the HNS3 ethernet driver.

Jie Wang (1):
  net: hns3: add cond_resched() to hns3 ring buffer init process

Yonglong Liu (1):
  net: hns3: fix kernel crash problem in concurrent scenario

 .../net/ethernet/hisilicon/hns3/hns3_enet.c   |  4 ++++
 .../net/ethernet/hisilicon/hns3/hns3_enet.h   |  2 ++
 .../hisilicon/hns3/hns3pf/hclge_main.c        | 21 ++++++++++++++-----
 3 files changed, 22 insertions(+), 5 deletions(-)

-- 
2.30.0



* [PATCH net 1/2] net: hns3: fix kernel crash problem in concurrent scenario
  2024-06-05  7:20 [PATCH net 0/2] There are some bugfix for the HNS3 ethernet driver Jijie Shao
@ 2024-06-05  7:20 ` Jijie Shao
  2024-06-06 16:26   ` Simon Horman
  2024-06-05  7:20 ` [PATCH net 2/2] net: hns3: add cond_resched() to hns3 ring buffer init process Jijie Shao
  2024-06-07 11:30 ` [PATCH net 0/2] There are some bugfix for the HNS3 ethernet driver patchwork-bot+netdevbpf
  2 siblings, 1 reply; 6+ messages in thread
From: Jijie Shao @ 2024-06-05  7:20 UTC
  To: yisen.zhuang, salil.mehta, davem, edumazet, kuba, pabeni, horms
  Cc: shenjian15, wangjie125, liuyonglong, shaojijie, chenhao418,
	netdev, linux-kernel

From: Yonglong Liu <liuyonglong@huawei.com>

When the link status changes, the nic driver needs to notify the roce
driver to handle the event. If the roce driver is being uninitialized
at that moment, the kernel may crash.

To fix the problem, check whether the roce client is registered before
notifying it of a link status change, and make uninit wait until any
in-progress link update has finished.

Fixes: 45e92b7e4e27 ("net: hns3: add calling roce callback function when link status change")
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
---
 .../hisilicon/hns3/hns3pf/hclge_main.c        | 21 ++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index 43cc6ee4d87d..82574ce0194f 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -3086,9 +3086,7 @@ static void hclge_push_link_status(struct hclge_dev *hdev)
 
 static void hclge_update_link_status(struct hclge_dev *hdev)
 {
-	struct hnae3_handle *rhandle = &hdev->vport[0].roce;
 	struct hnae3_handle *handle = &hdev->vport[0].nic;
-	struct hnae3_client *rclient = hdev->roce_client;
 	struct hnae3_client *client = hdev->nic_client;
 	int state;
 	int ret;
@@ -3112,8 +3110,15 @@ static void hclge_update_link_status(struct hclge_dev *hdev)
 
 		client->ops->link_status_change(handle, state);
 		hclge_config_mac_tnl_int(hdev, state);
-		if (rclient && rclient->ops->link_status_change)
-			rclient->ops->link_status_change(rhandle, state);
+
+		if (test_bit(HCLGE_STATE_ROCE_REGISTERED, &hdev->state)) {
+			struct hnae3_handle *rhandle = &hdev->vport[0].roce;
+			struct hnae3_client *rclient = hdev->roce_client;
+
+			if (rclient && rclient->ops->link_status_change)
+				rclient->ops->link_status_change(rhandle,
+								 state);
+		}
 
 		hclge_push_link_status(hdev);
 	}
@@ -11319,6 +11324,12 @@ static int hclge_init_client_instance(struct hnae3_client *client,
 	return ret;
 }
 
+static bool hclge_uninit_need_wait(struct hclge_dev *hdev)
+{
+	return test_bit(HCLGE_STATE_RST_HANDLING, &hdev->state) ||
+	       test_bit(HCLGE_STATE_LINK_UPDATING, &hdev->state);
+}
+
 static void hclge_uninit_client_instance(struct hnae3_client *client,
 					 struct hnae3_ae_dev *ae_dev)
 {
@@ -11327,7 +11338,7 @@ static void hclge_uninit_client_instance(struct hnae3_client *client,
 
 	if (hdev->roce_client) {
 		clear_bit(HCLGE_STATE_ROCE_REGISTERED, &hdev->state);
-		while (test_bit(HCLGE_STATE_RST_HANDLING, &hdev->state))
+		while (hclge_uninit_need_wait(hdev))
 			msleep(HCLGE_WAIT_RESET_DONE);
 
 		hdev->roce_client->ops->uninit_instance(&vport->roce, 0);
-- 
2.30.0
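
The ordering this fix relies on can be illustrated with a minimal userspace sketch. It is not driver code: `struct fake_dev`, the `ST_*` flags, and every function name below are hypothetical stand-ins, C11 atomics stand in for the kernel's bit operations, and `sched_yield()` stands in for `msleep()`. It also assumes that the update path sets an "updating" flag for its whole duration, which the hunks above do not show.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the HCLGE_STATE_* bits. */
#define ST_ROCE_REGISTERED (1u << 0)
#define ST_LINK_UPDATING   (1u << 1)
#define ST_RST_HANDLING    (1u << 2)

typedef void (*link_cb)(void *ctx, int link_up);

struct fake_dev {
	atomic_uint state;
	link_cb roce_cb;	/* may be NULL once the client is gone */
	void *roce_ctx;
};

/* Example callback: counts how many notifications were delivered. */
static void count_cb(void *ctx, int link_up)
{
	(void)link_up;
	(*(int *)ctx)++;
}

static void fake_dev_init(struct fake_dev *d, link_cb cb, void *ctx)
{
	assert(d != NULL);
	atomic_init(&d->state, ST_ROCE_REGISTERED);
	d->roce_cb = cb;
	d->roce_ctx = ctx;
}

static bool test_flag(struct fake_dev *d, unsigned int flag)
{
	return (atomic_load(&d->state) & flag) != 0;
}

static void update_link_status(struct fake_dev *d, int link_up)
{
	/* Assumed: only one updater runs at a time, guarded by a flag. */
	if (atomic_fetch_or(&d->state, ST_LINK_UPDATING) & ST_LINK_UPDATING)
		return;

	/* Notify the client only while it is still registered. */
	if (test_flag(d, ST_ROCE_REGISTERED) && d->roce_cb)
		d->roce_cb(d->roce_ctx, link_up);

	atomic_fetch_and(&d->state, ~ST_LINK_UPDATING);
}

static bool uninit_need_wait(struct fake_dev *d)
{
	return test_flag(d, ST_RST_HANDLING) ||
	       test_flag(d, ST_LINK_UPDATING);
}

static void uninit_roce(struct fake_dev *d)
{
	/* Unregister first, then wait for any in-flight update to drain. */
	atomic_fetch_and(&d->state, ~ST_ROCE_REGISTERED);
	while (uninit_need_wait(d))
		sched_yield();

	d->roce_cb = NULL;	/* now safe to tear down */
}
```

With this ordering, an updater either observes the registered flag already cleared and skips the callback, or the uninit path observes the updating flag and waits, so the callback is never invoked after teardown. Calling `update_link_status()` after `uninit_roce()` delivers no further notifications.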



* [PATCH net 2/2] net: hns3: add cond_resched() to hns3 ring buffer init process
  2024-06-05  7:20 [PATCH net 0/2] There are some bugfix for the HNS3 ethernet driver Jijie Shao
  2024-06-05  7:20 ` [PATCH net 1/2] net: hns3: fix kernel crash problem in concurrent scenario Jijie Shao
@ 2024-06-05  7:20 ` Jijie Shao
  2024-06-06 16:27   ` Simon Horman
  2024-06-07 11:30 ` [PATCH net 0/2] There are some bugfix for the HNS3 ethernet driver patchwork-bot+netdevbpf
  2 siblings, 1 reply; 6+ messages in thread
From: Jijie Shao @ 2024-06-05  7:20 UTC
  To: yisen.zhuang, salil.mehta, davem, edumazet, kuba, pabeni, horms
  Cc: shenjian15, wangjie125, liuyonglong, shaojijie, chenhao418,
	netdev, linux-kernel

From: Jie Wang <wangjie125@huawei.com>

Currently the hns3 ring buffer init process can hold the CPU for too
long when the Tx/Rx ring depth is large, which can cause a soft lockup.

So this patch adds cond_resched() to the process, so that the CPU can
switch to other tasks instead of looping without a break.

Fixes: a723fb8efe29 ("net: hns3: refine for set ring parameters")
Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
---
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 4 ++++
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.h | 2 ++
 2 files changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index ff71fb1eced9..a5fc0209d628 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -3535,6 +3535,9 @@ static int hns3_alloc_ring_buffers(struct hns3_enet_ring *ring)
 		ret = hns3_alloc_and_attach_buffer(ring, i);
 		if (ret)
 			goto out_buffer_fail;
+
+		if (!(i % HNS3_RESCHED_BD_NUM))
+			cond_resched();
 	}
 
 	return 0;
@@ -5107,6 +5110,7 @@ int hns3_init_all_ring(struct hns3_nic_priv *priv)
 		}
 
 		u64_stats_init(&priv->ring[i].syncp);
+		cond_resched();
 	}
 
 	return 0;
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h
index acd756b0c7c9..d36c4ed16d8d 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h
@@ -214,6 +214,8 @@ enum hns3_nic_state {
 #define HNS3_CQ_MODE_EQE			1U
 #define HNS3_CQ_MODE_CQE			0U
 
+#define HNS3_RESCHED_BD_NUM			1024
+
 enum hns3_pkt_l2t_type {
 	HNS3_L2_TYPE_UNICAST,
 	HNS3_L2_TYPE_MULTICAST,
-- 
2.30.0
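
The periodic-yield pattern from the first hunk can be sketched in userspace as follows. This is an illustration under stated assumptions, not driver code: `fill_ring()` and `RESCHED_EVERY` are hypothetical names, the store into `ring[i]` stands in for the per-descriptor buffer allocation, and `sched_yield()` is only a rough analogue of `cond_resched()`, which reschedules only when the scheduler has work pending.

```c
#include <sched.h>
#include <stddef.h>

/* Mirrors the value of HNS3_RESCHED_BD_NUM in the patch. */
#define RESCHED_EVERY 1024u

/*
 * Fill 'depth' ring entries, offering the CPU to other tasks once every
 * RESCHED_EVERY entries. Returns how many times the loop yielded.
 */
static size_t fill_ring(unsigned int *ring, size_t depth)
{
	size_t yields = 0;

	for (size_t i = 0; i < depth; i++) {
		ring[i] = (unsigned int)i;	/* stand-in for buffer setup */

		/* Same test as the patch: fires at i = 0, 1024, 2048, ... */
		if (!(i % RESCHED_EVERY)) {
			sched_yield();
			yields++;
		}
	}

	return yields;
}
```

Because the check is `!(i % RESCHED_EVERY)`, the first yield point is at index 0, so a ring of 4096 entries hits four yield points. Checking once per batch rather than on every entry keeps the overhead of the check small relative to the per-entry work.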



* Re: [PATCH net 1/2] net: hns3: fix kernel crash problem in concurrent scenario
  2024-06-05  7:20 ` [PATCH net 1/2] net: hns3: fix kernel crash problem in concurrent scenario Jijie Shao
@ 2024-06-06 16:26   ` Simon Horman
  0 siblings, 0 replies; 6+ messages in thread
From: Simon Horman @ 2024-06-06 16:26 UTC
  To: Jijie Shao
  Cc: yisen.zhuang, salil.mehta, davem, edumazet, kuba, pabeni,
	shenjian15, wangjie125, liuyonglong, chenhao418, netdev,
	linux-kernel

On Wed, Jun 05, 2024 at 03:20:57PM +0800, Jijie Shao wrote:
> From: Yonglong Liu <liuyonglong@huawei.com>
> 
> When the link status changes, the nic driver needs to notify the roce
> driver to handle the event. If the roce driver is being uninitialized
> at that moment, the kernel may crash.
> 
> To fix the problem, check whether the roce client is registered before
> notifying it of a link status change, and make uninit wait until any
> in-progress link update has finished.
> 
> Fixes: 45e92b7e4e27 ("net: hns3: add calling roce callback function when link status change")
> Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
> Signed-off-by: Jijie Shao <shaojijie@huawei.com>

Reviewed-by: Simon Horman <horms@kernel.org>



* Re: [PATCH net 2/2] net: hns3: add cond_resched() to hns3 ring buffer init process
  2024-06-05  7:20 ` [PATCH net 2/2] net: hns3: add cond_resched() to hns3 ring buffer init process Jijie Shao
@ 2024-06-06 16:27   ` Simon Horman
  0 siblings, 0 replies; 6+ messages in thread
From: Simon Horman @ 2024-06-06 16:27 UTC
  To: Jijie Shao
  Cc: yisen.zhuang, salil.mehta, davem, edumazet, kuba, pabeni,
	shenjian15, wangjie125, liuyonglong, chenhao418, netdev,
	linux-kernel

On Wed, Jun 05, 2024 at 03:20:58PM +0800, Jijie Shao wrote:
> From: Jie Wang <wangjie125@huawei.com>
> 
> Currently the hns3 ring buffer init process can hold the CPU for too
> long when the Tx/Rx ring depth is large, which can cause a soft lockup.
> 
> So this patch adds cond_resched() to the process, so that the CPU can
> switch to other tasks instead of looping without a break.
> 
> Fixes: a723fb8efe29 ("net: hns3: refine for set ring parameters")
> Signed-off-by: Jie Wang <wangjie125@huawei.com>
> Signed-off-by: Jijie Shao <shaojijie@huawei.com>

Reviewed-by: Simon Horman <horms@kernel.org>



* Re: [PATCH net 0/2] There are some bugfix for the HNS3 ethernet driver
  2024-06-05  7:20 [PATCH net 0/2] There are some bugfix for the HNS3 ethernet driver Jijie Shao
  2024-06-05  7:20 ` [PATCH net 1/2] net: hns3: fix kernel crash problem in concurrent scenario Jijie Shao
  2024-06-05  7:20 ` [PATCH net 2/2] net: hns3: add cond_resched() to hns3 ring buffer init process Jijie Shao
@ 2024-06-07 11:30 ` patchwork-bot+netdevbpf
  2 siblings, 0 replies; 6+ messages in thread
From: patchwork-bot+netdevbpf @ 2024-06-07 11:30 UTC
  To: Jijie Shao
  Cc: yisen.zhuang, salil.mehta, davem, edumazet, kuba, pabeni, horms,
	shenjian15, wangjie125, liuyonglong, chenhao418, netdev,
	linux-kernel

Hello:

This series was applied to netdev/net.git (main)
by David S. Miller <davem@davemloft.net>:

On Wed, 5 Jun 2024 15:20:56 +0800 you wrote:
> This series contains some bug fixes for the HNS3 ethernet driver.
> 
> Jie Wang (1):
>   net: hns3: add cond_resched() to hns3 ring buffer init process
> 
> Yonglong Liu (1):
>   net: hns3: fix kernel crash problem in concurrent scenario
> 
> [...]

Here is the summary with links:
  - [net,1/2] net: hns3: fix kernel crash problem in concurrent scenario
    https://git.kernel.org/netdev/net/c/12cda920212a
  - [net,2/2] net: hns3: add cond_resched() to hns3 ring buffer init process
    https://git.kernel.org/netdev/net/c/968fde83841a

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



