* [PATCH v2] net/mlx5: Fix variable not being completed when function returns
@ 2025-01-08 3:00 Chenguang Zhao
2025-01-09 13:25 ` Tariq Toukan
2025-01-09 16:30 ` patchwork-bot+netdevbpf
0 siblings, 2 replies; 4+ messages in thread
From: Chenguang Zhao @ 2025-01-08 3:00 UTC (permalink / raw)
To: Saeed Mahameed, Leon Romanovsky, Tariq Toukan, Andrew Lunn,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Moshe Shemesh
Cc: Chenguang Zhao, netdev, linux-rdma
When cmd_alloc_index() fails to allocate a command entry,
cmd_work_handler() returns early without completing ent->slotted,
so a task waiting on that completion blocks indefinitely.

Complete ent->slotted on this error path before returning.

mlx5_core 0000:01:00.0: cmd_work_handler:877:(pid 3880418): failed to allocate command entry
INFO: task kworker/13:2:4055883 blocked for more than 120 seconds.
Not tainted 4.19.90-25.44.v2101.ky10.aarch64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/13:2 D 0 4055883 2 0x00000228
Workqueue: events mlx5e_tx_dim_work [mlx5_core]
Call trace:
__switch_to+0xe8/0x150
__schedule+0x2a8/0x9b8
schedule+0x2c/0x88
schedule_timeout+0x204/0x478
wait_for_common+0x154/0x250
wait_for_completion+0x28/0x38
cmd_exec+0x7a0/0xa00 [mlx5_core]
mlx5_cmd_exec+0x54/0x80 [mlx5_core]
mlx5_core_modify_cq+0x6c/0x80 [mlx5_core]
mlx5_core_modify_cq_moderation+0xa0/0xb8 [mlx5_core]
mlx5e_tx_dim_work+0x54/0x68 [mlx5_core]
process_one_work+0x1b0/0x448
worker_thread+0x54/0x468
kthread+0x134/0x138
ret_from_fork+0x10/0x18
Fixes: 485d65e13571 ("net/mlx5: Add a timeout to acquire the command queue semaphore")
Signed-off-by: Chenguang Zhao <zhaochenguang@kylinos.cn>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
---
v2:
add Fixes tag and Reviewed-by
---
drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
index 6bd8a18e3af3..e733b81e18a2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -1013,6 +1013,7 @@ static void cmd_work_handler(struct work_struct *work)
 				complete(&ent->done);
 			}
 			up(&cmd->vars.sem);
+			complete(&ent->slotted);
 			return;
 		}
 	} else {
--
2.25.1
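
For readers following the thread, the effect of the one-line change can be sketched in a small userspace model. This is not the driver code: struct cmd_ent, model_work_handler() and the with_fix flag are hypothetical names, the completion is reduced to a counter with a non-blocking check, and it is an assumption (suggested by the call trace above) that the submitting task waits on ent->slotted.

```c
#include <errno.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's struct completion: a counter only,
 * with no sleeping or locking. */
struct completion {
	unsigned int done;
};

static void complete(struct completion *c)
{
	c->done++;
}

/* Non-blocking check, modelled on the kernel's try_wait_for_completion(). */
static bool try_wait_for_completion(struct completion *c)
{
	if (!c->done)
		return false;
	c->done--;
	return true;
}

/* Hypothetical, trimmed-down command entry with only the fields the patch
 * touches. */
struct cmd_ent {
	struct completion slotted;	/* slot allocation attempt has finished */
	struct completion done;		/* command finished, ret is valid */
	int ret;
};

/* Model of the allocation step in the work handler. A negative alloc_ret
 * simulates cmd_alloc_index() failing. with_fix selects whether the error
 * path also completes ent->slotted, as the patch does. */
static void model_work_handler(struct cmd_ent *ent, int alloc_ret, bool with_fix)
{
	if (alloc_ret < 0) {
		ent->ret = -EAGAIN;
		complete(&ent->done);
		if (with_fix)
			complete(&ent->slotted);	/* the line the patch adds */
		return;
	}
	/* Success path of the model: the slot is allocated, the command is
	 * still in flight, so only slotted is signalled here. */
	complete(&ent->slotted);
}
```

Without the fix, try_wait_for_completion(&ent->slotted) keeps returning false after a failed allocation, which in this model corresponds to the blocking wait_for_completion() in the hung-task report never returning.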
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH v2] net/mlx5: Fix variable not being completed when function returns
2025-01-08 3:00 [PATCH v2] net/mlx5: Fix variable not being completed when function returns Chenguang Zhao
@ 2025-01-09 13:25 ` Tariq Toukan
2025-01-09 16:29 ` Jakub Kicinski
2025-01-09 16:30 ` patchwork-bot+netdevbpf
1 sibling, 1 reply; 4+ messages in thread
From: Tariq Toukan @ 2025-01-09 13:25 UTC (permalink / raw)
To: Chenguang Zhao, Saeed Mahameed, Leon Romanovsky, Tariq Toukan,
Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Moshe Shemesh
Cc: netdev, linux-rdma
On 08/01/2025 5:00, Chenguang Zhao wrote:
> The cmd_work_handler function returns from the child function
> cmd_alloc_index because the allocate command entry fails,
> Before returning, there is no complete ent->slotted.
>
> The patch fixes it.
>
Unnecessary indentation.
> mlx5_core 0000:01:00.0: cmd_work_handler:877:(pid 3880418): failed to allocate command entry
> INFO: task kworker/13:2:4055883 blocked for more than 120 seconds.
> Not tainted 4.19.90-25.44.v2101.ky10.aarch64 #1
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> kworker/13:2 D 0 4055883 2 0x00000228
> Workqueue: events mlx5e_tx_dim_work [mlx5_core]
> Call trace:
> __switch_to+0xe8/0x150
> __schedule+0x2a8/0x9b8
> schedule+0x2c/0x88
> schedule_timeout+0x204/0x478
> wait_for_common+0x154/0x250
> wait_for_completion+0x28/0x38
> cmd_exec+0x7a0/0xa00 [mlx5_core]
> mlx5_cmd_exec+0x54/0x80 [mlx5_core]
> mlx5_core_modify_cq+0x6c/0x80 [mlx5_core]
> mlx5_core_modify_cq_moderation+0xa0/0xb8 [mlx5_core]
> mlx5e_tx_dim_work+0x54/0x68 [mlx5_core]
> process_one_work+0x1b0/0x448
> worker_thread+0x54/0x468
> kthread+0x134/0x138
> ret_from_fork+0x10/0x18
>
> Fixes: 485d65e13571 ("net/mlx5: Add a timeout to acquire the command queue semaphore")
Also for the Fixes tag.
Other than that:
Acked-by: Tariq Toukan <tariqt@nvidia.com>
>
> Signed-off-by: Chenguang Zhao <zhaochenguang@kylinos.cn>
> Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
> ---
> v2:
> add Fixes tag and Reviewed-by
> ---
> drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
> index 6bd8a18e3af3..e733b81e18a2 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
> @@ -1013,6 +1013,7 @@ static void cmd_work_handler(struct work_struct *work)
>  				complete(&ent->done);
>  			}
>  			up(&cmd->vars.sem);
> +			complete(&ent->slotted);
>  			return;
>  		}
>  	} else {
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH v2] net/mlx5: Fix variable not being completed when function returns
2025-01-09 13:25 ` Tariq Toukan
@ 2025-01-09 16:29 ` Jakub Kicinski
0 siblings, 0 replies; 4+ messages in thread
From: Jakub Kicinski @ 2025-01-09 16:29 UTC (permalink / raw)
To: Tariq Toukan
Cc: Chenguang Zhao, Saeed Mahameed, Leon Romanovsky, Tariq Toukan,
Andrew Lunn, David S. Miller, Eric Dumazet, Paolo Abeni,
Moshe Shemesh, netdev, linux-rdma
On Thu, 9 Jan 2025 15:25:36 +0200 Tariq Toukan wrote:
> > mlx5_core 0000:01:00.0: cmd_work_handler:877:(pid 3880418): failed to allocate command entry
> > INFO: task kworker/13:2:4055883 blocked for more than 120 seconds.
> > Not tainted 4.19.90-25.44.v2101.ky10.aarch64 #1
> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > kworker/13:2 D 0 4055883 2 0x00000228
> > Workqueue: events mlx5e_tx_dim_work [mlx5_core]
> > Call trace:
> > __switch_to+0xe8/0x150
> > __schedule+0x2a8/0x9b8
> > schedule+0x2c/0x88
> > schedule_timeout+0x204/0x478
> > wait_for_common+0x154/0x250
> > wait_for_completion+0x28/0x38
> > cmd_exec+0x7a0/0xa00 [mlx5_core]
> > mlx5_cmd_exec+0x54/0x80 [mlx5_core]
> > mlx5_core_modify_cq+0x6c/0x80 [mlx5_core]
> > mlx5_core_modify_cq_moderation+0xa0/0xb8 [mlx5_core]
> > mlx5e_tx_dim_work+0x54/0x68 [mlx5_core]
> > process_one_work+0x1b0/0x448
> > worker_thread+0x54/0x468
> > kthread+0x134/0x138
> > ret_from_fork+0x10/0x18
> >
> > Fixes: 485d65e13571 ("net/mlx5: Add a timeout to acquire the command queue semaphore")
>
> Also for the Fixes tag.
>
> Other than that:
> Acked-by: Tariq Toukan <tariqt@nvidia.com>
rewritten the commit message and applied, thanks!
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH v2] net/mlx5: Fix variable not being completed when function returns
2025-01-08 3:00 [PATCH v2] net/mlx5: Fix variable not being completed when function returns Chenguang Zhao
2025-01-09 13:25 ` Tariq Toukan
@ 2025-01-09 16:30 ` patchwork-bot+netdevbpf
1 sibling, 0 replies; 4+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-01-09 16:30 UTC (permalink / raw)
To: Chenguang Zhao
Cc: saeedm, leon, tariqt, andrew+netdev, davem, edumazet, kuba,
pabeni, moshe, netdev, linux-rdma
Hello:
This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Wed, 8 Jan 2025 11:00:09 +0800 you wrote:
> The cmd_work_handler function returns from the child function
> cmd_alloc_index because the allocate command entry fails,
> Before returning, there is no complete ent->slotted.
>
> The patch fixes it.
>
> mlx5_core 0000:01:00.0: cmd_work_handler:877:(pid 3880418): failed to allocate command entry
> INFO: task kworker/13:2:4055883 blocked for more than 120 seconds.
> Not tainted 4.19.90-25.44.v2101.ky10.aarch64 #1
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> kworker/13:2 D 0 4055883 2 0x00000228
> Workqueue: events mlx5e_tx_dim_work [mlx5_core]
> Call trace:
> __switch_to+0xe8/0x150
> __schedule+0x2a8/0x9b8
> schedule+0x2c/0x88
> schedule_timeout+0x204/0x478
> wait_for_common+0x154/0x250
> wait_for_completion+0x28/0x38
> cmd_exec+0x7a0/0xa00 [mlx5_core]
> mlx5_cmd_exec+0x54/0x80 [mlx5_core]
> mlx5_core_modify_cq+0x6c/0x80 [mlx5_core]
> mlx5_core_modify_cq_moderation+0xa0/0xb8 [mlx5_core]
> mlx5e_tx_dim_work+0x54/0x68 [mlx5_core]
> process_one_work+0x1b0/0x448
> worker_thread+0x54/0x468
> kthread+0x134/0x138
> ret_from_fork+0x10/0x18
>
> [...]
Here is the summary with links:
- [v2] net/mlx5: Fix variable not being completed when function returns
https://git.kernel.org/netdev/net/c/0e2909c6bec9
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 4+ messages in thread